Humans versus machines: Distinguishing Korean, Chinese, and Japanese faces via internal and external features

Poster Presentation: Monday, May 19, 2025, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Face and Body Perception: Parts and wholes

Cansu Malak¹, Christian Wallraven¹; ¹Korea University

Deep learning algorithms have shown super-human performance in face identification for several years, raising the question of how similarly these algorithms and humans process faces. To compare human and machine face processing in more detail, we present results from a fine-grained ethnicity categorization task in which the goal was to distinguish three categories of East Asian faces (Chinese, Japanese, and Korean) using internal and external facial features. In the human experiments, participants viewed 600 grayscale images of male soccer players, either as cropped faces (Experiment 1, internal features only, N=53) or as full-face images (Experiment 2, including external features such as hair and face outline, N=52), in a three-alternative forced-choice task. Average accuracy was 40.08% for cropped faces and improved to 52.38% for full faces (p<.001); against a chance level of 33.3%, these values show how difficult the task was for humans. To compare with deep learning algorithms, we used the DeepFace library to extract embeddings from 10 state-of-the-art face recognition models and used a support vector machine classifier to predict ethnicity, evaluated with 10-fold cross-validation. Average validation-set performance across models was 74.88%, significantly higher than human performance (p<.001), with no differences across algorithms or image types. Interestingly, Japanese faces were easier to categorize for both humans and machines. However, an item-based analysis showed only weak agreement in accuracy between humans and machines (all r²<.08; p<.001). Overall, our results show that internal features suffice for ethnicity categorization by deep learning algorithms, whereas humans require external information (from hairstyle and/or face outline) for this task. Although there are some shared performance patterns, the deep learning algorithms seem to process faces differently from humans in this task.
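The item-based comparison between humans and machines can be illustrated with a minimal sketch. The abstract does not specify how item-level accuracy was computed, so this example assumes that each image's accuracy is the proportion of raters (participants or models) who categorized it correctly, and that agreement is measured as the squared Pearson correlation between the two per-item accuracy vectors. The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt


def item_accuracy(correct_matrix):
    """Per-item accuracy: fraction of raters (rows) correct on each item (column)."""
    n_raters = len(correct_matrix)
    n_items = len(correct_matrix[0])
    return [sum(row[i] for row in correct_matrix) / n_raters for i in range(n_items)]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (not from the study):
# rows = raters, columns = face images, 1 = correct response, 0 = incorrect.
human_correct = [
    [1, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
]
model_correct = [
    [1, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
]

human_acc = item_accuracy(human_correct)
model_acc = item_accuracy(model_correct)

r = pearson_r(human_acc, model_acc)
r_squared = r ** 2  # share of item-level variance common to humans and models
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Under these assumptions, a small r² means that the images humans find hard are largely not the images the models find hard, even when overall accuracy patterns look similar. The sketch does not compute a significance test.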

Acknowledgements: This study was supported by the National Research Foundation of Korea (BK21 FOUR, NRF-2022R1A2C2092118, NRF-2022R1H1A2092007) and by IITP grants funded by the Korea government (No. RS-2019-II190079, Dept. of AI, Korea University; No. RS-2021-II212068, AI Innovation Hub).