Dynamic Object Processing in Macaque IT Cortex: Temporal Dynamics and Model Limitations
Poster Presentation: Monday, May 19, 2025, 8:30 am – 12:30 pm, Pavilion
Session: Object Recognition: Neural mechanisms
Matteo Dunnhofer1,2, Christian Micheloni2, Kohitij Kar1; 1York University, 2University of Udine
The macaque inferior temporal (IT) cortex, the apex of the ventral visual stream, plays a crucial role in object recognition. While most studies have relied on static images to predict neural responses and build encoding models, how dynamic visual inputs are transformed into IT responses remains incompletely understood. Here, we presented 200 videos (500 ms each, 18 frames at 60 Hz) to two macaques while recording neural activity from 132 reliable sites in the IT cortex. Each video contained objects moving within a fixed background, enabling us to investigate the temporal dynamics of IT population responses. We addressed three key questions. First, we asked whether IT neurons exhibit reliable activity beyond the initial transient response (70–150 ms) in response to videos. Neural responses were significantly reliable throughout the video duration (split-half Spearman correlation ~0.62), suggesting that dynamic stimuli engage sustained processing in the IT cortex. Second, we assessed whether these later responses could predict object identity using linear decoders. Decoding performance was significantly above chance (~58% classification accuracy at ~500 ms after video onset; chance level = 10%), indicating that IT activity carries discriminative information for object recognition beyond the initial response phase. Third, we evaluated how well feature activations from feedforward models (e.g., convolutional neural networks) could explain IT responses across the entire period of reliable activity. Early responses (70–170 ms) were significantly better predicted by activations from any frame-based model (~51% explained variance, EV) than later responses (470–570 ms; ~19% EV), highlighting a critical limitation of feedforward architectures in accounting for dynamic neural processing. Our findings reveal that dynamic stimuli elicit sustained and informative responses in the IT cortex.
The inability of standard feedforward models to explain later neural responses suggests the need for models incorporating recurrent or temporal mechanisms to better explain IT representations.
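As an illustration of the split-half reliability measure mentioned above, the following is a minimal sketch run on synthetic data. It is not the authors' analysis code. The function names, the number of random splits, the trial counts, and the noise levels are all assumptions chosen for the example. The abstract does not say whether a Spearman-Brown correction was applied, so none is applied here.

```python
import numpy as np

def rank(x):
    """Rank values along the last axis (0 = smallest); assumes no exact ties."""
    return np.argsort(np.argsort(x, axis=-1), axis=-1).astype(float)

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    ra, rb = rank(a), rank(b)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra * rb).sum() / np.sqrt((ra ** 2).sum() * (rb ** 2).sum()))

def split_half_reliability(responses, n_splits=100, seed=0):
    """Mean split-half Spearman correlation for one site and one time bin.

    responses: array of shape (n_trials, n_stimuli).
    Trials are randomly divided into two halves, each half is averaged
    per stimulus, and the two stimulus-response vectors are correlated.
    """
    rng = np.random.default_rng(seed)
    n_trials = responses.shape[0]
    half = n_trials // 2
    correlations = []
    for _ in range(n_splits):
        perm = rng.permutation(n_trials)
        first = responses[perm[:half]].mean(axis=0)
        second = responses[perm[half:2 * half]].mean(axis=0)
        correlations.append(spearman(first, second))
    return float(np.mean(correlations))

# Synthetic demo (hypothetical numbers): 20 trials x 200 stimuli.
rng = np.random.default_rng(1)
signal = rng.normal(size=200)                               # stimulus-driven component
driven_site = signal + rng.normal(scale=1.0, size=(20, 200))
noise_site = rng.normal(size=(20, 200))                     # no stimulus-driven component

r_driven = split_half_reliability(driven_site)
r_noise = split_half_reliability(noise_site)
```

In this toy setting the stimulus-driven site yields a high split-half correlation and the pure-noise site a value near zero. Repeating the computation in successive time bins would give a reliability time course of the kind the abstract describes.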
Acknowledgements: MD was funded by the European Union (MSCA Project 101151834 - PRINNEVOT). KK was supported by the Canada Research Chair Program, the Simons Foundation Autism Research Initiative (SFARI, 967073), and an NSERC DG.