A new technique for understanding how theory-based factors combine to control eye movements

Poster Presentation: Saturday, May 17, 2025, 2:45 – 6:45 pm, Banyan Breezeway
Session: Eye Movements: Models, clinical, context

Abe Leite1, Gregory J. Zelinsky1; 1Stony Brook University

In recent years, a stream of large-scale modeling work, from Deep Gaze I to the Human Attention Transformer, has approached ceiling performance in predicting human eye movements during free viewing and search. In parallel, a more theory-driven set of studies has investigated how much of gaze behavior can be explained by theoretically relevant factors such as visual salience or object recognition uncertainty. With some exceptions, this theory-driven work has been limited in two ways: (i) it typically compares models that each incorporate a single factor, and (ii) it typically does not yield probabilistic models that make concrete predictions about behavior. We introduce a novel statistical method for understanding how multiple theoretically important factors are integrated to give rise to free-viewing behavior. Our method builds on the technique of "phrasing saliency maps probabilistically" introduced by Kümmerer, Wallis, and Bethge (PNAS 2015), in which a single factor induces a probability map by transforming its values with a piecewise-linear, monotonically increasing function. Our "probabilistic signal integration" instead combines multiple factors using a piecewise-linear function that is monotonically increasing in all of its inputs. The standard softmax function is then applied to obtain a predictive probability distribution. This approach allows multiple factors influencing fixation to be integrated under the strong but meaningful assumption that the probability of fixating a point must increase to some extent whenever any predictive factor increases. No parcellation of fixations is required. Applying the approach to many factors lets us measure how much a model's likelihood drops when each factor is omitted (as in nested GLM comparisons), and thereby assess how much information each factor contributes uniquely to the model. We present an initial test of this approach by replicating, with our technique, the analysis of free-viewing data performed by Chakraborty, Samaras, and Zelinsky (JoV 2022) and extending it to multi-factor integration.
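The pipeline described above (monotone transform of several factor maps, softmax, likelihood comparison with a factor omitted) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the joint piecewise-linear function is simplified to a sum of per-factor monotone piecewise-linear transforms (one special case of a function increasing in all inputs), the parameters are hand-picked rather than fitted, and all names (`monotone_pwl`, `fixation_log_density`, `mean_log_likelihood`) and the toy data are hypothetical.

```python
import numpy as np

def monotone_pwl(x, knots, increments):
    """Piecewise-linear, nondecreasing transform of x.

    Values at the knots are a cumulative sum of nonnegative increments,
    which guarantees monotonicity. (Hypothetical helper.)
    """
    values = np.concatenate([[0.0], np.cumsum(np.abs(increments))])
    return np.interp(x, knots, values)

def fixation_log_density(factor_maps, knots, increments_per_factor):
    """Combine factor maps additively, then apply a log-softmax over pixels.

    ASSUMPTION: an additive combination stands in for the general
    multi-input piecewise-linear function described in the abstract.
    """
    score = sum(
        monotone_pwl(m, knots, inc)
        for m, inc in zip(factor_maps, increments_per_factor)
    )
    score = score - score.max()  # shift for numerical stability
    return score - np.log(np.exp(score).sum())

def mean_log_likelihood(log_density, fixations):
    """Average log-probability assigned to fixated (row, col) pixels."""
    return log_density[fixations[:, 0], fixations[:, 1]].mean()

# Toy data: two random "factor" maps and three made-up fixations.
rng = np.random.default_rng(0)
salience = rng.random((4, 6))
uncertainty = rng.random((4, 6))
knots = np.linspace(0.0, 1.0, 5)
increments = [np.array([0.5, 0.5, 1.0, 1.0]), np.array([0.2, 0.2, 0.2, 0.2])]
fixations = np.array([[0, 1], [2, 3], [3, 5]])

full = fixation_log_density([salience, uncertainty], knots, increments)
reduced = fixation_log_density([salience], knots, increments[:1])

# Change in mean log-likelihood when the second factor is omitted.
# In a real analysis each model would be refitted before comparing.
drop = mean_log_likelihood(full, fixations) - mean_log_likelihood(reduced, fixations)
print(f"likelihood change from omitting factor 2: {drop:.4f}")
```

Because every per-factor transform is nondecreasing and the softmax preserves ordering, raising any factor at a pixel cannot lower that pixel's fixation probability, which is the monotonicity assumption stated in the abstract.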

Acknowledgements: This project is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. 2234683.