From salience to meaning and scene grammar: Predicting visual search efficiency in naturalistic scenes

Poster Presentation: Sunday, May 18, 2025, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Visual Search: Eye movements, scenes, real-world stimuli

Antje Nuthmann¹, Anton Janser¹; ¹Kiel University

Efficiently locating objects within visual scenes is crucial for everyday behavior and involves eye movements to direct our attention. We report results from a large-scale project employing a quasi-experimental approach to investigate scene guidance during visual search in naturalistic environments. Using 170 real-world scene images, each featuring a single target object, we examined how naturally varying object and scene properties influence temporal measures of search efficiency. Object-based predictors included the object’s distance from the center of the scene, its size, and its visual salience, derived from saliency map computations. To gauge the spatial distribution of meaning within the scenes, a meaning map was generated for each scene image by aggregating crowd-sourced responses. Object meaning was represented by the mean map value over the search object, while the mean value across all pixels in the map served as a proxy for the information density of the scene. The object’s relationship to the scene was captured through human ratings of semantic fit, reflecting the likelihood of encountering the object in the scene, and syntactic fit, reflecting its positional plausibility. Central to the study was an eye-tracking experiment in which over 50 observers located the target object in each scene by directing their gaze to it. The main dependent variable was the latency to first fixation on the target, measuring how efficiently attention was guided toward it. Linear mixed-model analyses revealed independent effects of object size, object meaning, and syntactic fit, indicating shorter latencies for larger targets, higher-meaning targets, and targets appearing at more plausible locations. Moreover, latencies increased as the information density of the scene increased. First-pass gaze duration for the target object showed opposing influences of object salience and meaning: it was longer for more salient objects but shorter for those with higher meaning. We discuss our findings in the context of existing research using experimental manipulations.
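
The two meaning-map predictors described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a meaning map is available as a 2-D array and that the search object is given as a boolean mask of the same shape. The function name `meaning_predictors` and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np


def meaning_predictors(meaning_map, object_mask):
    """Return (object_meaning, information_density) for one scene.

    meaning_map : 2-D array of per-pixel meaning values (assumed format).
    object_mask : 2-D boolean array, True for pixels of the search object.
    """
    meaning_map = np.asarray(meaning_map, dtype=float)
    object_mask = np.asarray(object_mask, dtype=bool)

    if meaning_map.shape != object_mask.shape:
        raise ValueError("meaning_map and object_mask must have the same shape")
    if not object_mask.any():
        raise ValueError("object_mask must contain at least one object pixel")

    # Object meaning: mean map value over the pixels of the search object.
    object_meaning = float(meaning_map[object_mask].mean())
    # Information density proxy: mean map value across all pixels.
    information_density = float(meaning_map.mean())
    return object_meaning, information_density


# Toy 2 x 2 "scene" with the object occupying the bottom-right pixel.
toy_map = np.array([[1.0, 2.0],
                    [3.0, 6.0]])
toy_mask = np.array([[False, False],
                     [False, True]])

obj_meaning, density = meaning_predictors(toy_map, toy_mask)
print(obj_meaning, density)
```

In this toy case the object value is higher than the scene-wide mean, which is the kind of contrast the two predictors are meant to separate. How the object region was actually delineated in the study is not specified here.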