Modeling Action-Perception Coupling with Reciprocally Connected Neural Fields

Poster Presentation: Saturday, May 17, 2025, 8:30 am – 12:30 pm, Pavilion
Session: Face and Body Perception: Neural

Xinrui Jiang¹, Martin A. Giese¹; ¹CIN & HIH, University Clinics Tübingen

Visual perception is modulated by various contextual factors, including self-motion. The existence of the mirror neuron system highlights a strong connection between action observation and execution, yet the mechanisms underlying action-perception coupling remain incompletely understood.

METHODS: To investigate interactions between visual and motor regions, we developed a hierarchical neural model that represents perceived and executed actions by reciprocally connected recurrent neural networks (neural fields). The model comprises four interconnected parts: a visual pathway, neural fields representing perceived actions, neural fields representing motor plans, and a highly simplified motor pathway. The visual pathway recognizes body shapes using a pre-trained deep neural network. The neural fields representing visually perceived shape sequences and motor programs include recurrent interactions that result in sequence selectivity or support autonomous traveling pulse solutions. The interactions between these fields, and with fields representing different action types, are designed so that different actions inhibit each other, while temporally coherent perceived and executed actions mutually excite each other. The motor pathway generates image sequences of executed actions from the activation patterns in the motor neural field. We validated the model using hand movement image sequences, varying the temporal or movement-type congruence between executed and visually perceived actions.

RESULTS: In congruent conditions, synchronized visual inputs and action execution enhanced visual and motor neural field activation, consistent with prior psychophysical findings. Temporally incongruent conditions revealed inhibition of the visual neural field when delays between visual cues and movement exceeded a critical threshold. Notably, our model also aligned with recent fMRI findings that presenting congruent visual stimuli during movement suppresses the activation of units encoding incongruent movement patterns.

CONCLUSIONS: Our model successfully captured action-perception interactions across varying congruence conditions. It makes specific predictions regarding excitatory and inhibitory interactions between visual and motor representations of actions, depending on task conditions.
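
To make the field dynamics described under METHODS more concrete, the sketch below shows, in Python/NumPy, two reciprocally coupled one-dimensional Amari-type neural fields defined over an action-phase axis, one standing in for the visual field and one for the motor field. This is an illustrative reconstruction under standard neural-field assumptions, not the authors' implementation: all parameter values, kernel shapes, and function names are chosen for illustration, and the deep-network visual pathway, the competition between different action types, and the autonomous traveling-pulse dynamics of the full model are omitted. The sketch reproduces only the qualitative congruency effect, i.e., stronger field activation when perceived and executed action phases are temporally aligned.

# Illustrative sketch (not the authors' code): two reciprocally coupled
# Amari-type neural fields over an action-phase axis [0, 2*pi).
# All parameter values are assumptions chosen for illustration.
import numpy as np

N = 180                                    # grid points on the phase axis
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = x[1] - x[0]

def wrap(d):
    """Signed angular distance in (-pi, pi]."""
    return np.angle(np.exp(1j * d))

def f(u):
    """Sigmoidal firing-rate function."""
    return 1.0 / (1.0 + np.exp(-6.0 * (u - 0.4)))

# Mexican-hat lateral kernel: local excitation, broader inhibition.
w = 1.2 * np.exp(-wrap(x)**2 / 0.3) - 0.5 * np.exp(-wrap(x)**2 / 2.0)

def lateral(r):
    """Circular convolution of the lateral kernel with the rate pattern r."""
    return np.real(np.fft.ifft(np.fft.fft(w) * np.fft.fft(r))) * dx

def simulate(phase_lag, c_cross=0.5, tau=8.0, h=0.6, dt=0.5, T=600, omega=0.01):
    """Drive both fields with Gaussian inputs whose centres move at speed omega.
    phase_lag shifts the motor input relative to the visual input (0 = congruent)."""
    u_vis = -h * np.ones(N)                # resting potentials below threshold
    u_mot = -h * np.ones(N)
    for t in range(T):
        s_vis = np.exp(-wrap(x - omega * t)**2 / 0.2)
        s_mot = np.exp(-wrap(x - omega * t - phase_lag)**2 / 0.2)
        r_vis, r_mot = f(u_vis), f(u_mot)
        # Cross-terms (c_cross * r_other) implement mutual excitation between
        # temporally coherent perceived and executed action phases.
        u_vis = u_vis + dt / tau * (-u_vis - h + lateral(r_vis) + c_cross * r_mot + s_vis)
        u_mot = u_mot + dt / tau * (-u_mot - h + lateral(r_mot) + c_cross * r_vis + s_mot)
    return u_vis.max()

print("peak visual field potential, congruent:  ", simulate(phase_lag=0.0))
print("peak visual field potential, incongruent:", simulate(phase_lag=np.pi))

The purely excitatory cross-coupling term is one simple way to realize mutual excitation between temporally coherent perceived and executed actions: when the two activation bumps are phase-aligned, each field receives additional drive from the other, whereas with a large phase lag the cross term contributes almost nothing at the location of the visual bump, so its peak activation drops.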

Acknowledgements: The work was funded by ERC 2019-SyG-RELEVANCE-856495. The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Xinrui Jiang.