Detecting AI-generated faces at a glance? Effects of stimulus duration on deepfake detection and eye movements
Poster Presentation: Sunday, May 18, 2025, 2:45 – 6:45 pm, Banyan Breezeway
Session: Face and Body Perception: Body
Zihao ZHAO1, Mingzhi LYU2, Adams Wai Kin KONG2, Hong XU1; 1School of Social Sciences, Nanyang Technological University, 2College of Computing and Data Science, Nanyang Technological University
Advances in artificial intelligence (AI) have produced hyper-realistic facial images. Recent studies have found that humans cannot reliably tell deepfakes from real faces. However, whether longer viewing durations improve deepfake detection remains unclear. This study investigated the effects of viewing duration and deepfake algorithm on detection accuracy and eye movement patterns. Thirty-eight participants viewed 112 facial images of six types (Stable Diffusion-generated (SD 1.5 and SD XL 1.0), Style Generative Adversarial Network (StyleGAN)-generated, face-swapped images, and face images from the Deepfake Detection Challenge (DFDC), Karolinska Directed Emotional Faces (KDEF), and Tsinghua Facial Expression datasets), judging each as fake or real. Images were presented at six viewing durations (16.67, 33.33, 50, 100, 500, and 1000 ms) across six blocks in random order, with eye movements recorded by an EyeLink 1000 Plus. One participant's data were excluded from analysis due to extremely low accuracy. Generalized Linear Mixed Models showed that image type and viewing duration significantly influenced detection accuracy (ps < 0.001 for main and interaction effects). SD-generated and real images were detected easily (accuracy > 83%), even at 16.67 ms, whereas StyleGAN and face-swapped images were difficult to detect (accuracy ~7% and ~20%, respectively). For face-swapped images, accuracy improved with longer durations, suggesting that extended exposure aids the identification of subtle artifacts. Accuracy for StyleGAN images did not improve with longer viewing times and even decreased. Eye movement analyses showed that real and face-swapped images from DFDC elicited fewer fixations and saccades at the 500 and 1000 ms durations, whereas SD-generated images elicited more saccades and larger saccade amplitudes. Area-of-Interest (AOI) analysis revealed predominant fixation on the nose region across all image types. Our results show that deepfake algorithm and viewing duration significantly affect deepfake detection and eye movement patterns, shedding light on face perception mechanisms in the age of AI.
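For readers who wish to reproduce this style of analysis, below is a minimal sketch of the trial-level accuracy model described above, written in Python with statsmodels. The input file and column names (subject, image_type, duration_ms, correct) are hypothetical, and statsmodels fits binomial mixed models by variational Bayes rather than the maximum-likelihood GLMMs more commonly reported; treat it as an illustration of the model structure, not the authors' pipeline.

```python
# Hypothetical sketch of a binomial GLMM for trial-level detection accuracy:
# fixed effects of image type, viewing duration, and their interaction,
# plus a random intercept per participant. Not the authors' code.
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

# Hypothetical long-format data: one row per trial, with a binary
# `correct` column (1 = correct fake/real judgment, 0 = incorrect).
df = pd.read_csv("trials.csv")

model = BinomialBayesMixedGLM.from_formula(
    "correct ~ C(image_type) * C(duration_ms)",  # main effects + interaction
    {"subject": "0 + C(subject)"},               # random intercept per participant
    df,
)
result = model.fit_vb()  # variational Bayes fit (statsmodels' mixed-GLM backend)
print(result.summary())
```

Coding duration as a six-level factor mirrors the blocked design; a continuous (e.g., log-duration) predictor would be a reasonable alternative if one expected a monotonic duration effect.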
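Similarly, the Area-of-Interest finding (predominant fixation on the nose region) reduces to counting fixations per facial region. The sketch below assumes a hypothetical fixation table with one row per fixation and an `aoi` label column; again, it illustrates the analysis, not the authors' code.

```python
# Hypothetical AOI summary: count fixations per facial region and image type.
import pandas as pd

fix = pd.read_csv("fixations.csv")  # hypothetical: one row per fixation

aoi_counts = (
    fix.groupby(["image_type", "aoi"])  # e.g., aoi in {eyes, nose, mouth, ...}
       .size()
       .rename("n_fixations")
       .reset_index()
       .sort_values(["image_type", "n_fixations"], ascending=[True, False])
)
print(aoi_counts)  # the abstract reports the nose region dominating for all types
```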