Capturing the representational dynamics of face perception in deep recurrent neural networks
Poster Presentation: Tuesday, May 20, 2025, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Object Recognition: Models
Hossein Adeli1, Chase W. King1, Nikolaus Kriegeskorte1; 1Columbia University
To understand the space of potential neural mechanisms of primate visual recognition and how they unfold over time, we investigate the representational dynamics of recurrent convolutional neural networks (RCNNs). We explore a family of models with bottom-up and lateral connections that were optimized for face-identification and object-recognition tasks. Using representational similarity analysis (RSA), we observed that only models trained for face identification showed a late-emerging, prominent distinction between identities, as seen in monkey face patch AM. Interestingly, early model responses strongly separated objects from faces. These findings suggest that the dynamics of face recognition that emerge in a hierarchical recurrent neural network prioritize category-level recognition (face detection) at early stages, triggering later category-specific computations that enable individual-level recognition (face discrimination), as observed in neurophysiological findings. Our results also show that models trained simultaneously on face identification and object recognition were more likely to show the signature of mirror-symmetric viewpoint tuning in their intermediate representations, as has been reported for monkey face patch AL. We then examined the tuning properties of individual units in the last layer of our network across timesteps. After embedding the face and non-face object stimuli in a multi-dimensional representational space, we determined each unit's tuning axes as the directions in which its responses increased (measured separately for the face and non-face object clusters). The model exhibited a change in alignment between the face and object axes with increasing timesteps, resembling the late-emerging identity-discrimination tuning recently observed in primate face patches. Taken together, these results give us a candidate mechanistic account of primate face perception.
The model is consistent with evidence on individual unit tuning and population geometry, revealing how the visual system dynamically separates categories first and identities later.
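The two analyses described above (per-timestep RSA and the alignment of per-unit face and object tuning axes) might be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the abstract does not state the dissimilarity measure or how the tuning axes were estimated, so correlation distance and a least-squares linear fit are assumptions, and all function names, array shapes, and the synthetic data are hypothetical.

```python
import numpy as np


def rdm(activations):
    """Representational dissimilarity matrix for one timestep.

    activations: array of shape (n_stimuli, n_units).
    Assumption: dissimilarity is correlation distance (1 - Pearson r)
    between the response patterns of each pair of stimuli.
    """
    return 1.0 - np.corrcoef(activations)


def rdms_over_time(activations_by_step):
    """Stack one RDM per timestep; input shape (n_steps, n_stimuli, n_units)."""
    return np.stack([rdm(a) for a in activations_by_step])


def tuning_axis(embedding, responses):
    """Unit-norm direction in embedding space along which a unit's response increases.

    embedding: (n_stimuli, n_dims) stimulus coordinates.
    responses: (n_stimuli,) responses of a single unit.
    Assumption: the axis is the least-squares linear fit of response on
    the mean-centered coordinates.
    """
    X = embedding - embedding.mean(axis=0)
    y = responses - responses.mean()
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def axis_alignment(embedding, responses, is_face):
    """Cosine between a unit's face axis and object axis, fit separately per cluster."""
    face_axis = tuning_axis(embedding[is_face], responses[is_face])
    object_axis = tuning_axis(embedding[~is_face], responses[~is_face])
    return float(face_axis @ object_axis)


if __name__ == "__main__":
    # Synthetic stand-in data; a real analysis would use model activations.
    rng = np.random.default_rng(0)
    n_steps, n_stimuli, n_units, n_dims = 4, 40, 16, 5
    acts = rng.normal(size=(n_steps, n_stimuli, n_units))
    embedding = rng.normal(size=(n_stimuli, n_dims))
    is_face = np.arange(n_stimuli) < n_stimuli // 2

    print("RDM stack shape:", rdms_over_time(acts).shape)
    for t in range(n_steps):
        # Alignment for unit 0 at each timestep.
        a = axis_alignment(embedding, acts[t, :, 0], is_face)
        print(f"step {t}: face/object axis alignment = {a:+.2f}")
```

Under these assumptions, an alignment near +1 means the unit is tuned along the same direction for faces and objects, while values near 0 or below indicate that the two axes have diverged. Tracking this value across timesteps is one way to look for the change of alignment described above.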
Acknowledgements: Research reported in this publication was supported in part by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under award number [RF1NS128897].