Dividing Attention Between Vision and Audition in Spatial Localization Tasks

Poster Presentation: Saturday, May 17, 2025, 2:45 – 6:45 pm, Pavilion
Session: Multisensory Processing: Audiovisual integration

Taylor J. Knickel1,2, Gordon E. Legge1,2, Ying-zi Xiong3; 1Department of Psychology, University of Minnesota, Minneapolis, MN, United States; 2Center for Applied and Translational Sensory Science, University of Minnesota, Minneapolis, MN, United States; 3Lions Vision Research and Rehabilitation Center, Wilmer Eye Institute, Johns Hopkins University

Spatial localization in everyday environments often involves attending to distinct auditory and visual stimuli simultaneously. How effectively do observers attend to widely separated but simultaneous visual and/or auditory events? If spatial attention processes are separate for vision and audition (modality-specific), a smaller cost would be expected for attending to one visual and one auditory target (a bimodal pair) than for two visual or two auditory targets (unimodal pairs). If spatial attention processes are shared across vision and audition (cross-modal), the expected cost for bimodal targets would be larger than for unimodal targets. Nineteen participants with normal vision completed a spatial localization task, verbally reporting the direction (azimuth) of a single visual or auditory target, a pair of unimodal targets, or a bimodal pair. Auditory cues were a piano or violin G5 note presented at 60 dB SPL; visual cues were an open or closed circle subtending 3 degrees. All stimuli were presented for 500 ms on each trial. Reported error was computed as the absolute difference, in degrees, between the reported and actual target locations. When locating bimodal targets, both visual and auditory performance remained unaffected compared to single-target conditions, regardless of spatial separation. When locating two simultaneous visual targets, reported error (M = 6.86°) did not differ significantly from bimodal conditions (M = 6.12°). However, locating two simultaneous sounds led to significantly larger errors (M = 28.28°) than bimodal conditions (M = 16.07°; p < 0.001), and these errors increased with spatial separation. The results point towards separate attentional processes for vision and audition (the modality-specific hypothesis): minimal cost was found in locating simultaneous visual and auditory targets that were spatially distinct, regardless of their locations and separations.
Increasing the difficulty of the bimodal localization tasks may reveal a cross-modal cost in future studies.
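The reported-error metric described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the function names are hypothetical, and the wrap at 360° is an assumption for azimuth angles that is not stated in the abstract.

```python
from statistics import mean

def localization_error(reported_deg, actual_deg):
    """Absolute azimuth error in degrees.

    Assumes azimuths are measured in degrees and wraps the
    difference so the result lies in [0, 180] (an assumption;
    the abstract states only 'absolute difference').
    """
    diff = abs(reported_deg - actual_deg) % 360
    return min(diff, 360 - diff)

def mean_error(trials):
    """Mean absolute error over (reported, actual) trial pairs."""
    return mean(localization_error(r, a) for r, a in trials)
```

For example, `localization_error(10, -5)` gives 15, and `localization_error(350, 10)` gives 20 rather than 340, because the wrap keeps errors within a half-circle.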

Acknowledgements: NEI R00EY030145