Category learning in deep neural networks: Information content and geometry of internal representations

📅 2025-10-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates how the information content and geometric structure of internal representations in deep neural networks shape classification performance during category learning. Method: a representation-learning framework guided by the class-conditional Fisher information matrix: (i) it deforms the representation space so that the neural Fisher information peaks at, and orients along, the class boundaries; and (ii) it maximizes the mutual information between class labels and the neural activity prior to the decision layer, yielding optimal perceptual sensitivity. Contribution/Results: theoretical analysis, validated on MNIST and controlled toy models, shows that this mechanism enhances the discriminability of stimuli near decision boundaries. It further offers an information-geometric unification of classic phenomena such as enhanced categorical perception, showing that optimal category learning corresponds to matching, and directionally aligning, the neural Fisher information with the class Fisher information.

📝 Abstract
In animals, category learning enhances discrimination between stimuli close to the category boundary. This phenomenon, called categorical perception, has also been observed empirically in artificial neural networks trained on classification tasks. In previous modeling work based on neuroscience data, we showed that this expansion/compression is a necessary outcome of efficient learning. Here we extend our theoretical framework to artificial networks. We show that minimizing the Bayes cost (the mean of the cross-entropy loss) implies maximizing the mutual information between the set of categories and the neural activities prior to the decision layer. Considering structured data with an underlying feature space of small dimension, we show that maximizing the mutual information implies (i) finding an appropriate projection space and (ii) building a neural representation with the appropriate metric. The latter is based on a Fisher information matrix measuring the sensitivity of the neural activity to changes in the projection space. Optimal learning makes this neural Fisher information follow a category-specific Fisher information, which measures the sensitivity of category membership. Category learning thus induces an expansion of neural space near decision boundaries. We characterize the properties of the categorical Fisher information, showing that its eigenvectors give the most discriminant directions at each point of the projection space. We find that, unexpectedly, its maxima are in general not exactly at, but near, the class boundaries. Considering toy models and the MNIST dataset, we numerically illustrate how, after learning, the two Fisher information matrices match and essentially align with the category boundaries. Finally, we relate our approach to the Information Bottleneck framework, and we exhibit a bias-variance decomposition of the Bayes cost, of interest in its own right.
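The two central objects of the abstract, the neural Fisher information (sensitivity of neural activity to the stimulus) and the categorical Fisher information (sensitivity of category membership), can be illustrated numerically. The sketch below is not the paper's code: the tiny random network, the finite-difference Jacobian, and the isotropic-Gaussian-noise assumption (under which the neural Fisher information reduces to J^T J) are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-layer network: stimulus x in R^2 -> hidden activity r in R^8 -> 2-class softmax.
W1, b1 = rng.normal(size=(8, 2)), np.zeros(8)
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)

def hidden(x):
    return np.tanh(W1 @ x + b1)           # neural activity prior to the decision layer

def class_probs(x):
    z = W2 @ hidden(x) + b2
    e = np.exp(z - z.max())
    return e / e.sum()                    # P(c | x)

def jacobian(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x."""
    f0 = f(x)
    J = np.zeros((f0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

def neural_fisher(x):
    # Assuming isotropic Gaussian response noise, F_neural(x) = J^T J.
    J = jacobian(hidden, x)
    return J.T @ J

def categorical_fisher(x):
    # Fisher information of P(c|x) w.r.t. x: sum_c (grad P_c)(grad P_c)^T / P_c.
    p = class_probs(x)
    G = jacobian(class_probs, x)          # row c holds dP(c|x)/dx
    return sum(np.outer(G[c], G[c]) / p[c] for c in range(p.size))

x = np.array([0.3, -0.5])
Fn, Fc = neural_fisher(x), categorical_fisher(x)
```

After optimal learning, the paper's claim is that these two matrices match and align along the category boundaries; here, for an untrained random network, they are merely both symmetric positive semi-definite.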
Problem

Research questions and friction points this paper is trying to address.

Optimizing neural network geometry to enhance categorical discrimination near decision boundaries
Maximizing mutual information between categories and neural representations for efficient learning
Aligning neural Fisher information with category boundaries to improve classification accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Maximizing mutual information for category learning
Building neural representation with appropriate metric
Expanding neural space near decision boundaries
Laurent Bonnasse-Gahot
Centre d’Analyse et de Mathématique Sociales (CAMS), EHESS, CNRS, École des Hautes Études en Sciences Sociales, 54 bd. Raspail, 75006 Paris, France
Jean-Pierre Nadal
CNRS - LPENS, ENS, and CAMS, EHESS, Paris, France
computational neuroscience · complex systems in social sciences