From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

📅 2023-10-17
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work investigates why different deep neural network (DNN) architectures trained on the same dataset can differ sharply in generalization performance, with a focus on quantifying how non-linear each network is. To this end, we propose the *non-linearity signature*, the first theoretically sound way to approximately measure DNN non-linearity, built on a score derived from closed-form affine optimal transport mappings. Rather than relying solely on architectural depth or width to estimate expressive power, our framework tracks how non-linearity evolves across a network's feature representations using the Wasserstein distance and analytically tractable transport maps. Empirical evaluation across representative vision models, including AlexNet, ResNet, and Vision Transformers, demonstrates strong correlation (Pearson’s *r* > 0.92) between signature values and generalization accuracy. The method enables architecture-agnostic analysis of non-linearity and enhances model interpretability. The implementation is publicly available to support reproducible research and cross-architecture comparison.
📝 Abstract
In the last decade, we have witnessed the introduction of several novel deep neural network (DNN) architectures exhibiting ever-increasing performance across diverse tasks. Explaining the upward trend of their performance, however, remains difficult, as different DNN architectures of comparable depth and width (common factors associated with their expressive power) may exhibit drastically different performance even when trained on the same dataset. In this paper, we introduce the concept of the non-linearity signature of DNNs, the first theoretically sound solution for approximately measuring the non-linearity of deep neural networks. Built upon a score derived from closed-form optimal transport mappings, this signature provides a better understanding of the inner workings of a wide range of DNN architectures and learning paradigms, with a particular emphasis on computer vision tasks. We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature and its potential for far-reaching implications. The code for our work is available at https://github.com/qbouniot/AffScoreDeep
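To make the idea of a score "derived from closed-form optimal transport mappings" concrete, here is a minimal one-dimensional sketch in pure Python. It is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). The function names are hypothetical, and the normalization by sqrt(2 Var(Y)) is an assumed choice that keeps the score in [0, 1]. In 1D, the affine map that matches the mean and standard deviation of the input to those of the output is T(x) = m_Y + (s_Y / s_X)(x - m_X), and the 2-Wasserstein distance between two equal-size samples is obtained by matching sorted values.

```python
import math
import random
import statistics


def affine_ot_map_1d(xs, ys):
    """Affine map T(x) = m_y + (s_y / s_x) * (x - m_x) matching mean and std of ys."""
    m_x, m_y = statistics.fmean(xs), statistics.fmean(ys)
    s_x, s_y = statistics.pstdev(xs), statistics.pstdev(ys)
    if s_x == 0.0:
        raise ValueError("input samples have zero variance")
    scale = s_y / s_x
    return lambda x: m_y + scale * (x - m_x)


def w2_empirical_1d(a, b):
    """2-Wasserstein distance between two equal-size 1D samples (sorted matching)."""
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    squared = [(u - v) ** 2 for u, v in zip(sorted(a), sorted(b))]
    return math.sqrt(statistics.fmean(squared))


def affinity_score_1d(xs, activation):
    """Hypothetical score: 1 when the affine map reproduces the output
    distribution, lower when it does not.
    The sqrt(2 * Var(Y)) normalization is an assumption of this sketch."""
    ys = [activation(x) for x in xs]
    var_y = statistics.pvariance(ys)
    if var_y == 0.0:
        raise ValueError("activation output has zero variance")
    transport = affine_ot_map_1d(xs, ys)
    mapped = [transport(x) for x in xs]
    return 1.0 - w2_empirical_1d(mapped, ys) / math.sqrt(2.0 * var_y)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    print("affine:", affinity_score_1d(samples, lambda x: 3.0 * x + 1.0))
    print("relu:  ", affinity_score_1d(samples, lambda x: max(0.0, x)))
    print("tanh:  ", affinity_score_1d(samples, math.tanh))
```

An increasing affine function scores essentially 1, while ReLU and tanh score lower on the same inputs. The score depends on the input distribution as well as on the function, so the same activation can look more or less linear depending on the range of values it receives.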
Problem

Research questions and friction points this paper is trying to address.

Measure non-linearity in deep neural networks
Compare performance of different DNN architectures
Understand inner workings of DNNs in computer vision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing non-linearity signature for DNNs
Using optimal transport mappings for measurement
Focusing on computer vision task analysis
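
For reference, the best-known closed-form optimal transport mapping is the one between two Gaussian distributions. The block below states it as a standard result. The symbols are chosen here for illustration, and whether the paper uses exactly this form is an assumption, since the text above only says the mappings are closed-form and affine.

```latex
% Optimal transport map between N(m_X, \Sigma_X) and N(m_Y, \Sigma_Y)
% under the squared Euclidean cost (\Sigma_X assumed invertible):
T(x) = A x + b, \qquad
A = \Sigma_X^{-1/2}
    \left( \Sigma_X^{1/2} \, \Sigma_Y \, \Sigma_X^{1/2} \right)^{1/2}
    \Sigma_X^{-1/2}, \qquad
b = m_Y - A \, m_X .

% Corresponding squared 2-Wasserstein distance:
W_2^2 = \lVert m_X - m_Y \rVert_2^2
      + \operatorname{Tr}\!\left( \Sigma_X + \Sigma_Y
      - 2 \left( \Sigma_X^{1/2} \, \Sigma_Y \, \Sigma_X^{1/2} \right)^{1/2} \right).
```

Because the map depends only on means and covariances, it can be estimated from activations without running an iterative optimal transport solver.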
🔎 Similar Papers
Quentin Bouniot
Postdoc at TUM and Helmholtz Munich
deep learning, representation learning, explainability, uncertainty, learning with limited labels
I. Redko
Noah’s Ark Lab, Paris
Anton Mallasto
Smartly.io
Charlotte Laclau
LTCI, Télécom Paris, Institut Polytechnique de Paris, France
Karol Arndt
Aalto University, Finland, Intelligent Robotics Group
Oliver Struckmeier
Aalto University, Finland, Intelligent Robotics Group
Markus Heinonen
Academy research Fellow, Aalto University
Bayesian deep learning, dynamical systems, generative models
V. Kyrki
Aalto University, Finland, Department of Computer Science
Samuel Kaski
Director, ELLIS Institute Finland; Professor, Aalto University and University of Manchester
Probabilistic machine learning, AI4Science, Collaborative AI