🤖 AI Summary
This study investigates the intrinsic relationships among generalization, robustness, and interpretability in low-capacity neural networks, focusing on the trade-offs among model capacity, sparsity, and task complexity.
Method: We construct a progressively challenging MNIST binary classification benchmark and apply extreme magnitude pruning (up to 95% sparsity), saliency map analysis, and controlled ablation experiments to systematically characterize the performance limits and inference consistency of sparse subnetworks.
Contribution/Results: (1) High-performing sparse subnetworks exist, with their minimal effective capacity scaling directly with task complexity; (2) over-parameterization markedly enhances robustness to input corruption, and pruning preserves the original model’s core inference pathways; (3) strong generalization and robustness persist even at high sparsity levels, empirically validating the feasibility of “small yet powerful” network design. These findings advance principled understanding of sparse neural architectures and inform efficient, reliable, and interpretable deep learning.
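The summary refers to extreme magnitude pruning at up to 95% sparsity. The paper's exact procedure is not shown here; a minimal numpy sketch of global unstructured magnitude pruning (the standard technique that name usually denotes, with the function name and threshold logic being illustrative assumptions) might look like:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.95):
    """Zero out the smallest-magnitude entries of `weights` so that
    roughly `sparsity` fraction of them become zero (global,
    unstructured magnitude pruning)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of entries to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune a random weight matrix to ~95% sparsity.
rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))
pruned = magnitude_prune(w, sparsity=0.95)
achieved = 1.0 - np.count_nonzero(pruned) / pruned.size
```

Surviving weights are left untouched, so the pruned network is a strict subnetwork of the dense model, which is what makes the saliency-map comparison between the two meaningful.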
📝 Abstract
Although modern deep learning often relies on massive over-parameterized models, the fundamental interplay between capacity, sparsity, and robustness in low-capacity networks remains a vital area of study. We introduce a controlled framework to investigate these properties by creating a suite of binary classification tasks from the MNIST dataset with increasing visual difficulty (e.g., distinguishing 0 from 1 as an easy task versus 4 from 9 as a hard one). Our experiments reveal three core findings. First, the minimum model capacity required for successful generalization scales directly with task complexity. Second, these trained networks are robust to extreme magnitude pruning (up to 95% sparsity), revealing the existence of sparse, high-performing subnetworks. Third, we show that over-parameterization provides a significant advantage in robustness against input corruption. Interpretability analysis via saliency maps further confirms that these identified sparse subnetworks preserve the core reasoning process of the original dense models. This work provides a clear, empirical demonstration of the foundational trade-offs governing simple neural networks.
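The abstract's benchmark is built by carving digit-pair tasks out of MNIST. How the suite is assembled is not specified; a plausible sketch, assuming the full dataset is already loaded as numpy arrays (`images` of shape (N, 28, 28) and integer `labels`, both hypothetical names), is:

```python
import numpy as np

def make_binary_task(images, labels, pos_digit, neg_digit):
    """Extract a two-digit subset of MNIST as a binary task.
    Returns the filtered images and labels remapped to {0, 1},
    where 1 marks `pos_digit`."""
    keep = (labels == pos_digit) | (labels == neg_digit)
    x = images[keep]
    y = (labels[keep] == pos_digit).astype(np.int64)
    return x, y

# Under this scheme, the easy and hard ends of the suite would be:
#   x_easy, y_easy = make_binary_task(images, labels, 0, 1)
#   x_hard, y_hard = make_binary_task(images, labels, 4, 9)
```

Visually dissimilar pairs such as 0/1 yield the easy tasks, while confusable pairs such as 4/9 supply the hard end of the difficulty range.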