Targeted perturbations reveal brain-like local coding axes in robustified, but not standard, ANN-based brain models

Sep-30-2025–arXiv.org Artificial Intelligence

Artificial neural networks (ANNs) have become the de facto standard for modeling the human visual system, primarily due to their success in predicting neural responses. However, with many models now achieving similar predictive accuracy, we need a stronger criterion. Here, we use small-scale adversarial probes to characterize the local representational geometry of many highly predictive ANN-based brain models. We report four key findings. First, we show that most contemporary ANN-based brain models are unexpectedly fragile. Despite high prediction scores, their response predictions are highly sensitive to small, imperceptible perturbations, revealing unreliable local coding directions. Second, we demonstrate that a model's sensitivity to adversarial probes can better discriminate between candidate neural encoding models than prediction accuracy alone. Third, we find that standard models rely on distinct local coding directions that do not transfer across model architectures. Finally, we show that adversarial probes from robustified models produce generalizable and semantically meaningful changes, suggesting that they capture the local coding dimensions of the visual system. Together, our work shows that local representational geometry provides a stronger criterion for brain model evaluation. We also provide empirical grounds for favoring robust models, whose more stable coding axes not only align better with neural selectivity but also generate concrete, testable predictions for future experiments.

For over a decade, NeuroAI has celebrated artificial neural networks (ANNs) for how well they predict brain responses (Yamins et al., 2014; Kriegeskorte, 2015; Storrs et al., 2021; Zhuang et al., 2021; Doerig et al., 2023). However, the field now faces a new challenge: a diverse array of ANN models predict data equally well, making it nearly impossible to distinguish between them using accuracy alone (Schrimpf et al., 2018; Conwell et al., 2023; Linsley et al., 2023; Ratan Murty et al., 2021).
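To make the idea of a small adversarial probe concrete, the following is a minimal sketch and not the authors' procedure. It assumes a toy encoding model (a single tanh layer with a linear readout) and a one-step signed-gradient perturbation under an L-infinity budget, then compares the resulting change in the predicted response with that of a random perturbation of the same size. The names `predict`, `input_gradient`, and `linf_probe`, the layer sizes, and the budget `eps` are all hypothetical choices for illustration.

```python
import math
import random


def predict(x, W, v):
    """Toy encoding model (hypothetical): response = v . tanh(W x)."""
    hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
    return sum(vi * hi for vi, hi in zip(v, hidden))


def input_gradient(x, W, v):
    """Analytic gradient of the predicted response with respect to the input."""
    pre = [sum(w * xj for w, xj in zip(row, x)) for row in W]
    scale = [vi * (1.0 - math.tanh(p) ** 2) for vi, p in zip(v, pre)]
    return [sum(scale[i] * W[i][j] for i in range(len(W))) for j in range(len(x))]


def linf_probe(x, W, v, eps):
    """One signed-gradient step with L-infinity budget eps (FGSM-style)."""
    grad = input_gradient(x, W, v)
    return [eps * (1.0 if g > 0 else -1.0 if g < 0 else 0.0) for g in grad]


rng = random.Random(0)
n_in, n_hidden, eps = 64, 32, 1e-3  # assumed sizes and budget

# Random weights stand in for a fitted model; no real data is involved.
W = [[rng.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_in)] for _ in range(n_hidden)]
v = [rng.gauss(0, 1) for _ in range(n_hidden)]
x = [rng.gauss(0, 1) for _ in range(n_in)]

# Adversarial probe: push every input dimension in the direction that
# raises the predicted response.
delta = linf_probe(x, W, v, eps)
x_adv = [xj + dj for xj, dj in zip(x, delta)]
adv_change = predict(x_adv, W, v) - predict(x, W, v)

# Baseline: a random sign perturbation with the same L-infinity norm.
rand_delta = [eps * rng.choice((-1.0, 1.0)) for _ in range(n_in)]
x_rand = [xj + dj for xj, dj in zip(x, rand_delta)]
rand_change = predict(x_rand, W, v) - predict(x, W, v)

print(f"adversarial change: {adv_change:.6f}")
print(f"random change:      {rand_change:.6f}")
```

In this sketch the gap between the two printed values is a crude proxy for how sensitive a model's predictions are along its locally steepest direction; comparing such probes across models, or testing whether a probe built on one model moves another, would require the actual models and stimuli described in the paper.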

artificial intelligence, machine learning, sensitivity

Country:
- North America > United States (1.00)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Health & Medicine
  - Therapeutic Area > Neurology (1.00)
  - Health Care Technology (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science > Neuroscience (1.00)
  - Machine Learning
    - Performance Analysis > Accuracy (0.49)
    - Neural Networks > Deep Learning (0.46)
