A modular soft barrier model combines uncertainty and reward to predict human eye movements in a two-task driving environment
51.11, Tuesday, May 14, 8:15 - 9:45 am, Royal Ballroom 1-3
Leif Johnson1, Brian Sullivan2,3, Mary Hayhoe2, Dana Ballard1; 1Computer Science Department, UT Austin, 2Center for Perceptual Systems, UT Austin, 3Smith-Kettlewell Eye Institute
We present a control model of visual attention in driving that uses explicit representations of task goals, reward and task uncertainty to predict arbitration of eye movements among multiple tasks in a naturalistic environment, modeled on human data from Sullivan et al. (in press). Their experiment manipulated implicit reward and uncertainty in a two-task virtual driving environment while measuring gaze behavior and task accuracy. Subjects drove through a virtual city and were explicitly instructed to drive at an exact speed while simultaneously following a leader car. Implicit relative reward was manipulated by additional instructions that emphasized the importance of either the exact-speed or the following-the-leader task. Uncertainty was manipulated by adding perturbations to the car's speed, and was intended primarily to disrupt the exact-speed task. Indeed, when maintaining an exact speed was emphasized, adding uncertainty increased fixations on the speedometer. However, when following another car was emphasized, adding speed uncertainty had no effect on speedometer fixations. This interaction between reward and uncertainty suggests that subjects allocate gaze to uncertain task-relevant objects only when that task is associated with relatively high reward. We translated the high-level modular design of Sprague et al. (2007) into the dynamic driving world by allocating PID controllers to each driving task and modeling the propagation of error as a random walk. The model allocates eye movements to task-relevant objects using a "soft" barrier that defines a Boltzmann distribution over reward-weighted task uncertainties. This model closely matches the distribution of human gaze over time in the driving data, achieving a mean KL divergence from human data that is significantly lower than that from round-robin and constant-probability baselines. Additionally, the model is able to capture elements of task-switching dynamics exhibited by human drivers.
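The gaze-allocation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the temperature parameter, the choice to reset a task's uncertainty to zero when it is fixated, and the noise values in the usage note are all assumptions made for illustration, and the PID controllers are omitted entirely. The sketch shows only a Boltzmann (softmax) distribution over reward-weighted uncertainties, random-walk growth of uncertainty for unattended tasks, and a KL divergence for comparing gaze distributions.

```python
import math
import random


def gaze_probabilities(uncertainties, rewards, temperature=1.0):
    """Boltzmann (softmax) distribution over reward-weighted uncertainties.

    `temperature` is a hypothetical parameter; lower values make the
    choice more deterministic.
    """
    scores = [r * u / temperature for r, u in zip(rewards, uncertainties)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same tasks."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def simulate_gaze(rewards, noise_variances, steps=1000, temperature=1.0, seed=0):
    """Return the fraction of time steps on which each task is fixated.

    Assumption: each task's error variance grows linearly per step
    (a random walk) and resets to zero when that task is fixated.
    """
    rng = random.Random(seed)
    n_tasks = len(rewards)
    variances = [0.0] * n_tasks
    counts = [0] * n_tasks
    for _ in range(steps):
        # Random-walk propagation: variance accumulates on every step.
        variances = [v + nv for v, nv in zip(variances, noise_variances)]
        uncertainties = [math.sqrt(v) for v in variances]
        probs = gaze_probabilities(uncertainties, rewards, temperature)
        target = rng.choices(range(n_tasks), weights=probs)[0]
        counts[target] += 1
        variances[target] = 0.0  # looking at a task resolves its uncertainty
    return [c / steps for c in counts]
```

As a usage example, `simulate_gaze(rewards=[2.0, 1.0], noise_variances=[1.0, 0.2])` would stand in for a condition where the first task (say, exact speed) is both emphasized and perturbed. The resulting fixation fractions could then be compared against an empirical gaze distribution, or against a uniform baseline such as `[0.5, 0.5]`, using `kl_divergence`.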