Principal distortions for discrimination of image representations
Poster Presentation: Saturday, May 17, 2025, 8:30 am – 12:30 pm, Pavilion
Session: Theory
Jenelle Feather1,2,3, David Lipshutz1,2,4, Sarah E. Harvey2, Alex H. Williams2,3, Eero P. Simoncelli2,3; 1Equal Contribution, 2Flatiron Institute, 3New York University, 4Baylor College of Medicine
Similarity between image representations is often quantified by measuring their alignment over a set of natural images that span many object categories, viewpoints, or environments. However, systems with comparable representational similarity measures on these sets of natural images can have strikingly different sensitivities to small stimulus distortions. We propose a framework for comparing a set of image representations in terms of their sensitivities to small distortions. We quantify the local geometry of a representation using the Fisher Information matrix (FIM), a standard statistical measure of sensitivity to stimulus perturbations, and use this to define a metric on the local geometry of representations in the vicinity of a base image. This metric may then be used to optimally discriminate a set of representations by defining a pair of “principal distortions” that maximize the variance of the representations under this metric. We use this framework to compare a set of simple models of the early visual system, identifying a pair of image perturbations that allow immediate comparison of the models by visual inspection, and naturally extend to psychophysical experiments measuring the discrimination thresholds for the perturbations. In a second example, we apply our method to a set of deep neural network models and reveal differences in the local geometry that arise due to architecture or training protocol. These examples demonstrate the use of our framework to elucidate informative differences in local sensitivities between complex computational models, and lay the groundwork for comparison of model representations with human perception.
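To make the optimization idea concrete, here is a minimal NumPy sketch; it is an illustration of the variance-maximization principle described above, not the authors' implementation. It assumes deterministic model responses corrupted by isotropic Gaussian noise, so that each model's Fisher Information matrix at the base image is J^T J for its response Jacobian J, and it searches for a unit-norm distortion e that maximizes the variance, across models, of the log-sensitivities log(e^T I_p e) by projected gradient ascent on the unit sphere. The function names (fisher_information, principal_distortion), the gradient-ascent scheme, and the toy random Jacobians are all assumptions made for the sketch.

```python
import numpy as np

def fisher_information(jacobian):
    """FIM of a deterministic response under isotropic Gaussian noise: I = J^T J."""
    return jacobian.T @ jacobian

def principal_distortion(fims, n_steps=2000, lr=1e-2, seed=0):
    """Unit-norm distortion maximizing Var_p[log(e^T I_p e)] across models.

    Simple projected gradient ascent on the unit sphere (illustrative only).
    """
    rng = np.random.default_rng(seed)
    dim = fims[0].shape[0]
    e = rng.standard_normal(dim)
    e /= np.linalg.norm(e)
    n_models = len(fims)
    for _ in range(n_steps):
        sens = np.array([e @ I @ e for I in fims])        # per-model sensitivity e^T I_p e
        log_sens = np.log(sens)
        centered = log_sens - log_sens.mean()
        # d/de Var_p[log(e^T I_p e)] = (2/P) * sum_p centered_p * 2 I_p e / (e^T I_p e)
        grad = sum(4.0 * c / (n_models * s) * (I @ e)
                   for c, s, I in zip(centered, sens, fims))
        e = e + lr * grad
        e /= np.linalg.norm(e)                            # retract to the unit sphere
    return e

# Toy usage: random Jacobians stand in for model derivatives at a base image.
dim, n_models = 64, 4
jacobians = [np.random.default_rng(p).standard_normal((128, dim)) for p in range(n_models)]
fims = [fisher_information(J) for J in jacobians]
e_star = principal_distortion(fims)   # direction along which model sensitivities differ most
```

In the framework described in the abstract, a pair of such extremal distortions is used, so that the two perturbations together span the directions of greatest disagreement; the sketch above only recovers one variance-maximizing direction and omits the metric construction on FIMs that defines the full method.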