Using transfer learning to identify a neural network’s algorithm
Poster Presentation: Saturday, May 17, 2025, 8:30 am – 12:30 pm, Pavilion
Session: Theory
John Morrison1,2, Nikolaus Kriegeskorte2,3,4, Benjamin Peters5,6; 1Philosophy Department, Barnard College, Columbia University, 2Zuckerman Mind Brain Behavior Institute, Columbia University, 3Department of Psychology, Columbia University, 4Department of Neuroscience, Columbia University, 5School of Informatics, University of Edinburgh, 6School of Psychology & Neuroscience, University of Glasgow
Algorithms generate input-output mappings through step-by-step operations on representations. In vision science, algorithms explain both biological and artificial processes: for example, feature weighting explains image categorization, sequential sampling explains visual search, and Bayesian inference explains cue combination. The standard parts-based approach is to look for parts of the underlying network that correspond to the parts of the algorithm. But few such parts have been found, perhaps because they are too entangled. We propose an alternative approach: identifying a system's algorithm by assessing how quickly it learns alternative input-output mappings, that is, its transfer learning profile. We use artificial networks to demonstrate that this approach is promising. In our first toy experiment, we used transfer learning to show that networks can use different algorithms to compute the same arithmetic function. Importantly, the distinction between these two types of network was not evident in traditional parts-based analyses such as encoding and decoding analyses, weight similarity analysis, and representational similarity analysis. In our second experiment, we used transfer learning to show that convolutional networks use different algorithms to classify images of ellipses. The ellipses were generated from three latent variables: area, color, and circularity. The networks were trained to classify the ellipses into two classes defined by a linear function of their latent variables. We identified two types of algorithm: the first used independent feature detectors (one for area and one for circularity), and the second used joint feature detectors (one for both area and circularity). Once again, the distinct algorithms were not evident in traditional parts-based analyses.
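The ellipse task described above can be sketched in code. This is a minimal illustration only: each stimulus is defined by latent variables (area, color, circularity), and binary class labels come from a linear function of the latents. The latent ranges, weights, and threshold below are assumptions chosen for illustration, not the authors' actual values.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_ellipse_latents(n):
    """Sample latent variables and linear-rule labels for n ellipses.

    Ranges, weights, and threshold are illustrative assumptions."""
    area = rng.uniform(0.1, 1.0, n)
    color = rng.uniform(0.0, 1.0, n)
    circularity = rng.uniform(0.2, 1.0, n)
    latents = np.stack([area, color, circularity], axis=1)
    # Binary class from a linear function of the latents; here the class
    # depends on area + circularity (hypothetical weights), so a network
    # could solve it with independent or joint feature detectors.
    w = np.array([1.0, 0.0, 1.0])
    labels = (latents @ w > 1.1).astype(int)
    return latents, labels

latents, labels = sample_ellipse_latents(1000)
print(latents.shape, sorted(np.unique(labels)))
```

Rendering each latent vector as an actual ellipse image for a convolutional network is omitted here; the sketch only shows how a linear rule over latents defines the two classes.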
In an ongoing experiment, we are using transfer learning to show that convolutional networks use different algorithms to classify objects in naturalistic images with more complex latent features. Combined, our results suggest that transfer learning is a promising alternative to parts-based approaches.
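The core measurement, a transfer learning profile, can be illustrated with a toy model. In this hedged sketch, the "network" is a linear model fine-tuned by gradient descent on an alternative input-output mapping, and the profile is the per-step loss trajectory; two pretrained solutions that behave identically on an old task can still adapt at different speeds, which is the signature the approach exploits. The tasks, initializations, and hyperparameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def finetune_profile(w_init, X, y, lr=0.1, steps=50):
    """Return the per-step MSE trajectory while fine-tuning w on (X, y)."""
    w = w_init.copy()
    losses = []
    for _ in range(steps):
        err = X @ w - y
        losses.append(float(np.mean(err ** 2)))
        w -= lr * (X.T @ err) / len(y)  # gradient step on MSE
    return np.array(losses)

# An alternative target mapping (hypothetical).
X = rng.normal(size=(200, 4))
w_target = rng.normal(size=4)
y = X @ w_target

# Two hypothetical pretrained solutions: one whose internal solution is
# "close" to the new mapping, one that is "distant".
w_close = w_target + 0.1 * rng.normal(size=4)
w_distant = rng.normal(size=4)

profile_close = finetune_profile(w_close, X, y)
profile_distant = finetune_profile(w_distant, X, y)
```

Comparing such loss trajectories across a family of alternative mappings yields the transfer learning profile used to distinguish algorithms that are indistinguishable on the original task.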