Using AI-generated real-world objects to uncover the structure of visual memory
Poster Presentation: Sunday, May 18, 2025, 2:45 – 6:45 pm, Pavilion
Session: Visual Memory: Capacity and encoding of working memory
Jiachen Zou1, Chen Wei1,3, Quanying Liu1, Maria Robinson2; 1Southern University of Science and Technology, 2University of Warwick, 3University of Birmingham
Studying people’s memory for real-world concepts remains a key challenge in vision science. Here, we introduce a novel AI-driven generative model capable of dynamically creating visual stimuli depicting new concepts, enabling the study of human perception and memory with previously unattainable experimental control (Wei et al., 2024). In two visual working memory experiments, we generated "morph wheels" of real-world objects (e.g., animal and plant morphs) with smooth interpolations between different object instances. We demonstrate that the similarity structure predicted by the model can be used to make parameter-free predictions of people's memory errors across different morph types (Schurgin et al., 2020). Specifically, distributions of memory errors from one morph wheel were predicted by the model’s similarity structure from a different morph wheel (Brady & Robinson, 2023). Furthermore, the model can generate novel exemplars and states of real-world objects from text prompts. Exemplars capture variability within a category (e.g., different kinds of chairs), while states capture dynamic transformations of an object (e.g., the same chair rotated in different ways). Critically, people’s memory confusions replicated well-documented differences in memory for exemplar versus state stimuli. Finally, the model can be used to generate exemplar and state wheels with a controlled similarity structure, allowing parameter-free predictions of memory errors from exemplars to states. These findings highlight the versatility of the AI model, and we discuss how it provides a powerful experimental tool and theoretical framework for probing the structure of human cognition across visual domains.