Visual and Semantic Scene Information: A Steady-State Visual Evoked Potentials Study

Poster Presentation: Tuesday, May 20, 2025, 2:45 – 6:45 pm, Pavilion
Session: Scene Perception: Categorization, memory, clinical, intuitive physics, models

Skylar Stadhard1, Sage Aronson1, Michelle Greene1; 1Barnard College

Scenes are known to be processed rapidly, but it is less clear why some scenes are processed more quickly than others. Might there be information capacity limits to visual perception? We introduce methods for quantifying relative visual and semantic scene information and assess the extent to which each contributes to the speed of scene recognition. We employed a 2x2 design to compare the temporal processing of images with high and low visual and semantic information. Images were selected from the Places in the Wild dataset, which contains over 67,000 RAW photographs spanning 260 scene categories. We collected descriptions of each image from 10 participants and used natural language processing (NLP) techniques to compute five features, taking the first principal component as a unified semantic information metric. Visual information was computed by comparing the file sizes of the RAW and compressed PNG versions of each image; we reasoned that more compressible images contain more redundancy and therefore less visual information. Unexpectedly, we found that semantic and visual information scores were largely uncorrelated (r = 0.05). For our experiment, we used a sweeping steady-state visual evoked potential (SSVEP) paradigm with images flashing at 3 Hz. The phase coherence of each image increased over time from 0% to 100% in increments of 5% per second. We can assess the extent to which images were objectively recognized by examining EEG power at 3 Hz over time. Further, by filtering the data at 3 Hz, we can compute the mutual information between ERPs and images using only stimulus-driven responses. By quantifying visual and semantic information, we provide a method for testing two potential bottlenecks in visual scene processing.
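
A minimal sketch of the semantic metric described above, assuming a precomputed matrix of the five NLP-derived description features; the file name and the scikit-learn pipeline are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical input: one row per image, one column per NLP-derived
# description feature (five features total, as in the abstract).
features = np.loadtxt("nlp_features.csv", delimiter=",")  # shape: (n_images, 5)

# Standardize the features, then take the first principal component
# as a single semantic-information score per image.
semantic_information = PCA(n_components=1).fit_transform(
    StandardScaler().fit_transform(features)
).ravel()
```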
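
The compression-based visual information score can be approximated from file sizes alone. The function below is a sketch assuming each image exists as a paired RAW and PNG file; the specific ratio is illustrative rather than the authors' exact formula:

```python
import os

def visual_information(raw_path: str, png_path: str) -> float:
    """Relative compressed size as a proxy for visual information.

    A smaller PNG-to-RAW size ratio means the image compresses well,
    i.e., it is more redundant and carries less visual information.
    """
    return os.path.getsize(png_path) / os.path.getsize(raw_path)
```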
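
Tracking stimulus-driven EEG power at the 3 Hz presentation rate across the coherence sweep could look roughly like the following; the window length, taper, and single-channel handling are assumptions, not the recording pipeline actually used:

```python
import numpy as np

def power_at_3hz(eeg, fs, win_s=1.0, step_s=0.5, f_target=3.0):
    """Sliding-window estimate of EEG power at the 3 Hz tagging
    frequency for a single-channel trace `eeg` sampled at `fs` Hz."""
    win, step = int(win_s * fs), int(step_s * fs)
    times, power = [], []
    for start in range(0, len(eeg) - win + 1, step):
        seg = eeg[start:start + win] * np.hanning(win)  # taper each window
        spec = np.abs(np.fft.rfft(seg)) ** 2            # power spectrum
        freqs = np.fft.rfftfreq(win, d=1.0 / fs)
        power.append(spec[np.argmin(np.abs(freqs - f_target))])
        times.append((start + win / 2) / fs)            # window-center time
    return np.array(times), np.array(power)
```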

Acknowledgements: Supported by CAREER 2240815 to MRG