Multisensory Processing
Talk Session: Monday, May 19, 2025, 10:45 am – 12:30 pm, Talk Room 2
Talk 1, 10:45 am
Audiovisual interactions alter the gain of contrast-response functions in human visual cortex
Minsun Park1, Sam Ling1; 1Boston University
Humans are inherently multisensory, seamlessly integrating information across the senses to create a unified percept. Recent studies have challenged the traditional notion that early sensory areas are dedicated to a single modality, such as vision, by demonstrating their susceptibility to input from another modality, such as hearing. However, how cross-modal interactions influence early sensory processing remains unclear; in particular, it is unknown whether they can modulate sensory gain in early sensory cortex. In this study, using functional magnetic resonance imaging (fMRI), we investigated whether audiovisual interactions (AVI) modulate neural activity in early visual areas (V1-V3) by measuring their effects on the population contrast-response function, a fundamental building block of vision. Participants viewed vertical gratings that moved either leftwards or rightwards, with parametrically varied contrast (9 levels, 3-96%). While viewing these stimuli, they also heard synchronous binaural auditory motion whose direction was either congruent or incongruent with the visual motion, alongside a stationary-sound condition. We measured BOLD responses in V1-V3 to examine whether and how AVI alter the gain of the contrast response. Results showed higher contrast sensitivity when the audiovisual motion directions were congruent than when they were incongruent or the sound was stationary, a pattern observed consistently across V1-V3. These findings demonstrate that AVI improve visual sensitivity by modulating sensory gain in visual cortex, specifically when the directions of audiovisual motion are congruent. This suggests that interactions between vision and hearing can influence sensory computations in visual cortex, an area traditionally considered specific to visual processing.
This work was funded by the National Research Foundation of Korea (NRF) Grant RS-2024-00407838 to M. Park and by the National Institutes of Health (NIH) Grant R01EY028163 to S. Ling.
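The gain measurement at the heart of this study is conventionally made by fitting a Naka-Rushton function to the BOLD response at each contrast level and comparing fitted parameters across sound conditions. A minimal Python sketch of that style of fit follows; the contrast spacing, parameter values, and noise level are illustrative assumptions, not the authors' data or analysis code.

import numpy as np
from scipy.optimize import curve_fit

def naka_rushton(c, r_max, c50, n, baseline):
    """BOLD response as a function of stimulus contrast c (in %)."""
    return r_max * c**n / (c**n + c50**n) + baseline

# Nine contrast levels spanning 3-96%, as in the abstract; the exact
# spacing here is assumed for illustration.
contrasts = np.array([3, 6, 12, 18, 24, 36, 48, 72, 96], dtype=float)

# Simulated responses for one hypothetical condition.
rng = np.random.default_rng(0)
bold = naka_rushton(contrasts, r_max=2.0, c50=20.0, n=2.0, baseline=0.1)
bold += rng.normal(0, 0.05, contrasts.size)

popt, _ = curve_fit(naka_rushton, contrasts, bold, p0=[2.0, 30.0, 2.0, 0.0])
print(dict(zip(["r_max", "c50", "n", "baseline"], popt)))

Comparing fits across conditions distinguishes the two classic gain signatures: a lower c50 under congruent sound would indicate a contrast-gain change, while a higher r_max would indicate a response-gain change.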
Talk 2, 11:00 am
Auditory spatial cues influence cross-modal recruitment of visual prefrontal cortex
Abigail Noyce1, Wusheng Liang1, Madhumitha Manjunath1, Christopher Brown2, Barbara Shinn-Cunningham1; 1Carnegie Mellon University, 2University of Pittsburgh
Lateral prefrontal cortex (PFC) contains discrete regions that are preferentially recruited for visual versus auditory attention and working memory (WM), including regions in the superior and inferior precentral sulcus and the mid inferior frontal sulcus. During auditory WM for spatial locations, these visual-biased regions are also significantly recruited. This recruitment during spatial auditory cognition may indicate that additional resources are required to buttress audition's poor precision for spatial location (consistent with the multiple-demand account of PFC), or it may reflect task-specific recruitment of vision's cortical machinery for representing space. To better understand the role of visual-biased PFC in auditory spatial cognition, we used fMRI to first label these structures in individual subjects (N = 20) using a direct contrast of visual vs. auditory 2-back WM blocks. Then, in an independent task, we estimated the recruitment of visual-biased PFC during auditory spatial WM under three spatial cue conditions: a weak spatial cue (interaural time differences, ITDs), a moderate spatial cue (interaural level differences, ILDs), or a strong spatial cue (head-related transfer functions, HRTFs). The multiple-demand account predicts that visual PFC recruitment will be strongest under ITDs, because that is when the most effort is required to represent spatial location. Instead, we observe that visual PFC recruitment is lowest (but still positive) under ITDs and highest under HRTFs, suggesting that visual PFC represents spatial location rather than allocating resources. These results support a task-specific, rather than multiple-demand, account of visual-biased PFC's cognitive role.
Supported in part by ONR MURI N00014-19-12332
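For readers less familiar with the three cue types, the sketch below synthesizes the weakest one, an ITD, for a noise burst using the textbook Woodworth spherical-head approximation; the head radius, sampling rate, and function names are assumptions for illustration, not the authors' stimulus code.

import numpy as np

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s in air at roughly room temperature

def itd_seconds(azimuth_deg):
    """Interaural time difference via the Woodworth approximation."""
    theta = np.deg2rad(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (np.sin(theta) + theta)

def apply_itd(signal, fs, azimuth_deg):
    """Return (left, right) copies of `signal` offset by the ITD."""
    delay = int(round(abs(itd_seconds(azimuth_deg)) * fs))
    lead = np.pad(signal, (0, delay))   # ear nearer the source
    lag = np.pad(signal, (delay, 0))    # ear farther from the source
    # Positive azimuth = source on the right, so the left ear lags.
    return (lag, lead) if azimuth_deg > 0 else (lead, lag)

fs = 44100
noise = np.random.default_rng(1).standard_normal(fs)   # 1 s noise burst
left, right = apply_itd(noise, fs, azimuth_deg=45)
print(f"ITD at 45 degrees: {itd_seconds(45) * 1e6:.0f} microseconds")

ILDs would instead scale the two channels' amplitudes, and HRTFs would impose the full direction-dependent filtering measured at each ear, which is why the three cues carry progressively stronger spatial information.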
Talk 3, 11:15 am
Early visual cortex encodes multisensory postdictive perception with retinotopic specificity: a layer-specific fMRI study
Pieter Barkema1,2, Joost Haarsma1, Christoph Koenig1, Peter Kok1; 1Department of Imaging Neuroscience, UCL Queen Square Institute of Neurology, University College London, London, UK, 2Radboud University, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
Postdiction is a phenomenon in which later incoming information influences how we perceive earlier sensory input. Little is known about the neural mechanisms of postdiction, despite its important role in shaping perception. Accumulating research suggests that neural representations in early visual cortex (EVC) are not solely determined by bottom-up retinal inputs but additionally reflect top-down modulations of subjective perception, with bottom-up and top-down signals reflected in different cortical layers. Here, we extend this framework to postdictive perception. Inspired by Kok et al. (2016), we hypothesized that neural responses to postdictive illusions in deep cortical layers would reflect perception and that this effect would be retinotopically specific. We used the Audiovisual Rabbit paradigm (Stiles et al., 2018) to induce a postdictive illusory flash using sound. Twenty-four participants selected for high susceptibility to the illusion took part in the experiment, as well as in retinotopic mapping, during 7T functional magnetic resonance imaging (7T fMRI). EVC neural response amplitude reflected retinal input but not perception, with no layer-specific differences. Multivoxel pattern analysis, however, revealed that the EVC activity pattern evoked by an illusory flash was similar to that evoked by a real flash, and this effect was retinotopically specific. The effect was stronger in deep layers than in middle layers, in line with a top-down origin. These findings extend the emerging role of EVC in perception by, to our knowledge for the first time, implicating it in multisensory postdiction. Moreover, they support the emerging view that the amplitude of neural responses in EVC is primarily driven by retinal input, whereas EVC activity patterns reflect subjective perception.
This work was supported by a Wellcome/Royal Society Sir Henry Dale Fellowship [218535/Z/19/Z] and a European Research Council (ERC) Starting Grant [948548] to P.K. The Wellcome Centre for Human Neuroimaging was supported by core funding from the Wellcome Trust [203147/Z/16/Z].
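The pattern result lends itself to a cross-decoding reading: a classifier trained to separate real-flash from no-flash voxel patterns should label illusory-flash trials as containing a flash. The sketch below illustrates that logic with simulated data; the array shapes, classifier choice, and effect sizes are assumptions, not the authors' pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
n_trials, n_voxels = 80, 500   # placeholder dimensions

# Placeholder voxel patterns; real data would come from retinotopically
# mapped EVC voxels sampled at a given cortical depth.
real_flash = rng.normal(0.2, 1.0, (n_trials, n_voxels))
no_flash = rng.normal(-0.2, 1.0, (n_trials, n_voxels))
illusory = rng.normal(0.1, 1.0, (n_trials, n_voxels))

X = np.vstack([real_flash, no_flash])
y = np.array([1] * n_trials + [0] * n_trials)   # 1 = flash present

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)

# Fraction of illusory-flash trials decoded as "flash"; a value reliably
# above 0.5 would mean the illusory pattern resembles the real-flash one.
print("P(illusory decoded as flash):", clf.predict(illusory).mean())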
Talk 4, 11:30 am
Brain representations of numerosity across the senses and presentation format
Ying YANG1, Michele Fornaciai1, Irene Togoli1, Iqra Shahzad1, Alice Van Audenhaege1, Filippo Cerpelloni1,2, Olivier Collignon1,3; 1Institute of Psychology (IPSY) and Institute of Neuroscience (IoNS), University of Louvain, Belgium, 2Department of Brain and Cognition, Leuven Brain Institute, KU Leuven, 3The Sense Innovation and Research Center, HES-SO Valais-Wallis (Lausanne and Sion)
Whether it’s three pens on a table, three knocks on a door, or three strikes of a hammer on a nail, we automatically perceive “threeness”. This ability to seamlessly encode numerosity across senses and presentation formats has led to the assumption of an abstract numerical code in our minds. Yet research on the existence of such an abstract representation in the human brain has yielded inconsistent results. The current study used multivariate pattern analysis and representational similarity analysis to comprehensively investigate how the brain represents numerosities (range 2-5) across modalities (auditory, visual) and formats (sequential, simultaneous; symbolic, non-symbolic). We identified a set of dorsal brain regions, from early visual cortex to intraparietal and frontal regions, that encode specific non-symbolic numerosities across formats and modalities. The numerical distance effect, a hallmark of magnitude encoding, was observed in parietal regions. We also observed aligned representations of numerosities across visual and auditory modalities in intraparietal and frontal subregions, but only when they shared a sequential presentation format. Further exploration of the unique contributions of modality and format to numerosity representation revealed that distributed numerical activity in lateral intraparietal sulcus (IPS) subregions was mostly influenced by the modality of presentation, the anterior IPS by the format of presentation, while the medial IPS showed equal contributions from both factors. Our study provides a detailed description of the geometry of numerosity representation across senses and formats in the human brain and supports abstract numerical representation across the senses when the presentation format is equivalent.
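As an aside on method, the numerical distance effect reported here is the kind of signature representational similarity analysis is built to detect: a model dissimilarity matrix derived from |a - b| is compared against the neural one. A minimal sketch follows, with a simulated neural RDM standing in for real cross-validated distances; nothing here is the authors' analysis code.

import numpy as np
from scipy.stats import spearmanr

numerosities = np.array([2, 3, 4, 5])   # range used in the study

# Model RDM: dissimilarity grows with numerical distance |a - b|.
model_rdm = np.abs(numerosities[:, None] - numerosities[None, :]).astype(float)

# Simulated neural RDM: the model structure plus symmetric noise.
rng = np.random.default_rng(3)
noise = rng.normal(0, 0.5, model_rdm.shape)
neural_rdm = model_rdm + (noise + noise.T) / 2

# Rank-correlate the six unique (upper-triangle) pairs of the two RDMs.
iu = np.triu_indices(len(numerosities), k=1)
rho, p = spearmanr(model_rdm[iu], neural_rdm[iu])
print(f"model-neural RDM correlation: rho={rho:.2f} (p={p:.3f})")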
Talk 5, 11:45 am
Multisensory Continuous Psychophysics: Heading Perception is Faster but Not More Precise When Both Sound and Visual Cues are Present
Bjoern Joerges1, Jong-Jin Kim1, Laurence Harris1; 1York University, Toronto
Heading perception is an inherently multisensory phenomenon that can involve, among other cues, vision and audition. As in many other tasks, multisensory cues are expected to yield both higher response precision and shorter reaction times than unisensory cues, findings that are fairly well established in the trial-based tasks typical of this area of study. Here, we used a novel paradigm from vision science, continuous psychophysics, to investigate whether such multisensory enhancements of heading perception also emerge. We immersed 25 participants in a virtual environment in which they experienced visual information, auditory information, or both simultaneously, each consistent with self-motion that continuously changed direction. They were asked to continuously align a joystick with their direction of motion. Contrary to our expectations, we did not find any differences in precision between the three (auditory, visual, and visuo-auditory) conditions. However, participants reacted faster to changes in the stimulus in the visuo-auditory condition than in either unisensory condition. While this discrepancy between our results and what has generally been reported in the literature (e.g., Ernst & Banks, 2002) might reflect a speed-accuracy trade-off (Drugowitsch et al., 2014), it underlines the importance of testing long-established findings with novel paradigms in new, diversified contexts. References: Drugowitsch et al. (2014), eLife, 3, e03005; Ernst & Banks (2002), Nature, 415, 429-433.
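Continuous-tracking data of this kind are commonly scored by cross-correlating the stimulus trajectory with the response trace: the lag of the correlation peak indexes response speed, and the residual error after alignment indexes precision. The following sketch shows that scoring on simulated signals; the sampling rate, lag, and noise levels are assumed for illustration, not taken from the study.

import numpy as np

fs = 60.0                                  # assumed sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)               # one simulated 60 s trial
rng = np.random.default_rng(4)

# Random-walk heading signal and a delayed, noisy joystick response.
heading = np.cumsum(rng.normal(0, 0.2, t.size))
true_lag = int(0.4 * fs)                   # 400 ms, for illustration
response = np.roll(heading, true_lag) + rng.normal(0, 0.5, t.size)

# Estimated lag = location of the cross-correlation peak.
xcorr = np.correlate(response - response.mean(),
                     heading - heading.mean(), mode="full")
lags = np.arange(-t.size + 1, t.size)
lag_s = lags[np.argmax(xcorr)] / fs

# Precision = inverse variance of the residual after realignment.
residual = response - np.roll(heading, int(lag_s * fs))
print(f"lag: {lag_s * 1000:.0f} ms, precision: {1 / np.var(residual):.2f}")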
Talk 6, 12:00 pm
Examining the time course of visuoproprioceptive integration using the mirror box illusion
Grant Fairchild1, Yuqi Liu2, Riwa Safa3, Jared Medina1,3; 1Emory University, 2Chinese Academy of Sciences, 3University of Delaware
The mirror box illusion creates a visuoproprioceptive conflict between the actual position of the hand behind the mirror and the reflection of the participant’s viewed hand. After synchronous movement, or even passive viewing of the hand over time, participants report feeling their hand where they see it. Although the illusion involves evidence accumulation over time, the relationship between evidence accumulation and perceptual qualia is poorly understood. One possible mechanism is an abrupt state transition in which the visual estimate of hand position overrides the proprioceptive estimate as soon as the accumulated evidence favoring vision surpasses a threshold. Alternatively, the illusion may reflect a gradual shift in the weighting given to the visual versus proprioceptive estimates as evidence favoring vision accumulates. To investigate the temporal progression of the mirror box illusion, we recorded observers’ perceived hand position every five seconds in a series of experiments manipulating several factors that modulate the congruence between visual, motor, and proprioceptive signals. As expected, we found that the illusion is strengthened by factors increasing the congruence between the hidden hand and its reflection, including synchronous movements, decreased scalar or angular distance, and reduced biomechanical constraints. Notably, we found that the illusion can proceed along both abrupt and gradual trajectories: sometimes participants perceive an abrupt transition in the perceived position of their hand, as though it suddenly snapped into the position of the reflection, while other times they perceive a gradual shift, as though their hand slowly drifted toward the reflection. Furthermore, there appear to be consistent individual differences in observers’ tendency to experience the illusion abruptly or gradually. We discuss the implications of these findings for the mechanisms by which visual-proprioceptive conflict is resolved.
This project was funded by NSF grant 1632849.
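The two candidate mechanisms in this abstract make concretely different predictions for the five-second probe series, which a small simulation can make explicit. Below is a minimal sketch contrasting them; all parameter values (the 10 cm conflict, time constant, and evidence threshold) are illustrative assumptions, not fitted to the data.

import numpy as np

proprio_cm, visual_cm = 0.0, 10.0     # conflicting position estimates
t = np.arange(0, 60, 5.0)             # perceived position probed every 5 s

# Gradual account: the visual weight w(t) grows as evidence accumulates,
# and the percept is the reliability-weighted average of the estimates.
w = 1 - np.exp(-t / 20.0)             # assumed evidence time constant
gradual = (1 - w) * proprio_cm + w * visual_cm

# Abrupt account: the percept stays at the proprioceptive estimate until
# accumulated evidence crosses a threshold, then snaps to the visual one.
evidence = np.cumsum(np.full(t.size, 1.0))   # constant evidence per probe
abrupt = np.where(evidence >= 6.0, visual_cm, proprio_cm)

for ti, g, a in zip(t, gradual, abrupt):
    print(f"t={ti:4.0f}s  gradual={g:5.2f} cm  abrupt={a:5.2f} cm")

Consistent individual differences, as reported above, would then correspond to some observers being better described by the weighting curve and others by the threshold rule.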