🤖 AI Summary
Problem: State-of-the-art artificial neural network (ANN) brain models that achieve high predictive accuracy nonetheless show fragile local representational geometry: their response predictions are extremely sensitive to minute adversarial perturbations, which renders their local encoding directions unreliable and inconsistent with the structured selectivity observed in biological visual systems.
Method: We introduce “local representational geometry” as an evaluation framework, using adversarial probing, neural response modeling, and cross-architecture comparison of local encoding directions to systematically assess a diverse set of ANN brain models (a minimal probing sketch follows this summary).
Contribution/Results: Standard models exhibit unstable, semantically ambiguous local coding; in contrast, adversarially robust models show stable, transferable, and semantically interpretable local encoding axes that align better with human neural selectivity and yield experimentally testable predictions. This work establishes the geometric perspective as a new paradigm for evaluating and refining computational brain models.
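To make the probing idea concrete, below is a minimal sketch of one way such an adversarial probe might be computed: gradient ascent on a modeled neuron's predicted response under a small L2 budget. The torchvision backbone (randomly initialized here), the single-neuron linear readout, the stand-in stimulus, and the budget are all illustrative assumptions, not the paper's actual models or settings.

```python
# Minimal sketch of an adversarial probe against an ANN encoding model.
# Everything here is an illustrative stand-in: a randomly initialized
# torchvision backbone plus a linear readout playing the role of a fitted
# neural-response model; the paper's models, stimuli, and budgets may differ.
import torch
import torchvision.models as models

torch.manual_seed(0)

backbone = models.resnet50(weights=None).eval()  # stand-in feature extractor
feat_dim = backbone.fc.in_features
backbone.fc = torch.nn.Identity()                # expose penultimate features
readout = torch.nn.Linear(feat_dim, 1)           # one modeled "neuron"
for p in list(backbone.parameters()) + list(readout.parameters()):
    p.requires_grad_(False)                      # only the probe gets gradients

def predict(x):
    """Predicted response of the modeled neuron to stimulus x."""
    return readout(backbone(x)).squeeze()

image = torch.rand(1, 3, 224, 224)               # stand-in stimulus
delta = torch.zeros_like(image, requires_grad=True)
eps, lr = 0.5, 0.1                               # small L2 budget (assumed)

base = predict(image).item()
for _ in range(20):
    # Gradient ascent on the predicted response, i.e. push the stimulus
    # along the model's local encoding direction for this neuron.
    predict(image + delta).backward()
    with torch.no_grad():
        delta += lr * delta.grad / (delta.grad.norm() + 1e-12)
        if delta.norm() > eps:                   # project back onto the L2 ball
            delta *= eps / delta.norm()
    delta.grad.zero_()

print(f"response moved {base:.3f} -> {predict(image + delta).item():.3f} "
      f"with ||delta||_2 = {delta.norm().item():.3f}")
```

A fragile model, in the paper's sense, is one whose predicted response moves a long way even when the final perturbation norm stays imperceptibly small.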
📝 Abstract
Artificial neural networks (ANNs) have become the de facto standard for modeling the human visual system, primarily due to their success in predicting neural responses. However, with many models now achieving similar predictive accuracy, we need a stronger criterion. Here, we use small-scale adversarial probes to characterize the local representational geometry of many highly predictive ANN-based brain models. We report four key findings. First, we show that most contemporary ANN-based brain models are unexpectedly fragile: despite high prediction scores, their predicted responses are highly sensitive to small, imperceptible stimulus perturbations, revealing unreliable local coding directions. Second, we demonstrate that a model's sensitivity to adversarial probes can discriminate between candidate neural encoding models better than prediction accuracy alone. Third, we find that standard models rely on distinct local coding directions that do not transfer across model architectures. Finally, we show that adversarial probes from robustified models produce generalizable and semantically meaningful changes, suggesting that they capture the local coding dimensions of the visual system. Together, our work shows that local representational geometry provides a stronger criterion for brain model evaluation. We also provide empirical grounds for favoring robust models, whose more stable coding axes not only align better with neural selectivity but also generate concrete, testable predictions for future experiments.
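The transfer finding suggests a simple diagnostic that could, in principle, be run on any pair of candidate encoding models: craft a probe against one model and measure how much it moves the other's prediction. The sketch below uses toy stand-in encoders and an assumed probe budget; the abstract's claim is that for standard models such transfer is weak, whereas probes from robustified models generalize.

```python
# Toy sketch of the cross-model transfer test: craft a probe against
# model A and ask how much it moves model B. The two-layer "encoders",
# probe budget, and step sizes are all assumptions made for illustration.
import torch

torch.manual_seed(0)

def make_encoder(dim=64, hidden=512):
    # Stand-in for a fitted ANN encoding model (stimulus features -> response).
    return torch.nn.Sequential(
        torch.nn.Linear(dim, hidden), torch.nn.ReLU(),
        torch.nn.Linear(hidden, 1),
    )

def probe(model, x, eps=0.1, steps=20, lr=0.05):
    """Small-L2-norm perturbation pushing the model's predicted response up."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        model(x + delta).sum().backward()
        with torch.no_grad():
            delta += lr * delta.grad / (delta.grad.norm() + 1e-12)
            if delta.norm() > eps:
                delta *= eps / delta.norm()
        delta.grad.zero_()
    return delta.detach()

def sensitivity(model, x, delta):
    # Change in predicted response per unit perturbation norm.
    with torch.no_grad():
        return ((model(x + delta) - model(x)).abs().sum() / delta.norm()).item()

x = torch.randn(1, 64)
model_a, model_b = make_encoder(), make_encoder()
delta_a = probe(model_a, x)

# If local coding directions are idiosyncratic (the abstract's third finding),
# A's probe should move A's prediction far more than B's.
print("A on its own probe:", sensitivity(model_a, x, delta_a))
print("B on A's probe:    ", sensitivity(model_b, x, delta_a))
```

Running the same comparison with robustified models in place of the toy encoders is, per the abstract, where the probes begin to transfer and to carry semantic meaning.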