From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

📅 2023-10-17

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This work investigates why DNN architectures trained on the same dataset can differ so much in performance, with a focus on quantifying their non-linearity. To this end, the authors propose the *non-linearity signature*, which they describe as the first theoretically sound way to approximately measure DNN non-linearity, built on a score derived from closed-form affine optimal transport mappings. Rather than relying only on architectural depth or width as proxies for expressive power, the framework tracks how non-linear the transformations applied to feature representations are, using the Wasserstein distance and analytically tractable transport maps. Empirical evaluation across representative vision models, including AlexNet, ResNet, and Vision Transformers, demonstrates strong correlation (Pearson’s *r* > 0.92) between signature values and generalization accuracy. The method enables architecture-agnostic analysis of non-linearity and aids model interpretability. The implementation is publicly available to support reproducible research and cross-architectural comparison.

📝 Abstract

In the last decade, we have witnessed the introduction of several novel deep neural network (DNN) architectures exhibiting ever-increasing performance across diverse tasks. Explaining the upward trend of their performance, however, remains difficult, as different DNN architectures of comparable depth and width (common factors associated with their expressive power) may exhibit drastically different performance even when trained on the same dataset. In this paper, we introduce the concept of the non-linearity signature of DNNs, the first theoretically sound solution for approximately measuring the non-linearity of deep neural networks. Built upon a score derived from closed-form optimal transport mappings, this signature provides a better understanding of the inner workings of a wide range of DNN architectures and learning paradigms, with a particular emphasis on computer vision tasks. We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature and its potential for far-reaching implications. The code for our work is available at https://github.com/qbouniot/AffScoreDeep
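To make the idea of a score built on closed-form optimal transport mappings more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see their repository for that). It uses the standard closed-form optimal transport map between Gaussian approximations of two sample sets, which is affine. The function names (`affine_ot_map`, `paired_affinity`), the `eps` regularizer, and the paired-residual score with its normalization are all illustrative assumptions; in particular, the score compares each mapped input with its own output instead of computing a Wasserstein distance.

```python
import numpy as np


def _sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def affine_ot_map(x, y, eps=1e-8):
    """Closed-form OT map T(v) = a @ v + b between Gaussian fits of x and y.

    x, y: arrays of shape (n_samples, dim). eps is an assumed small ridge
    term that keeps the covariances invertible.
    """
    dim = x.shape[1]
    mean_x, mean_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False)) + eps * np.eye(dim)
    cov_y = np.atleast_2d(np.cov(y, rowvar=False)) + eps * np.eye(dim)
    cov_x_half = _sqrtm_psd(cov_x)
    cov_x_inv_half = np.linalg.inv(cov_x_half)
    a = cov_x_inv_half @ _sqrtm_psd(cov_x_half @ cov_y @ cov_x_half) @ cov_x_inv_half
    b = mean_y - a @ mean_x
    return a, b


def paired_affinity(x, y, eps=1e-8):
    """Hypothetical score: 1 when y_i == T(x_i) for all i, lower otherwise.

    Assumption: the residual is normalized by 2 * trace(cov(y)) and the
    result is clipped at 0. This is an illustrative choice only.
    """
    a, b = affine_ot_map(x, y, eps)
    residual = x @ a.T + b - y
    mean_sq = np.mean(np.sum(residual ** 2, axis=1))
    scale = 2.0 * np.trace(np.atleast_2d(np.cov(y, rowvar=False)))
    return max(0.0, 1.0 - np.sqrt(mean_sq / scale))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2000, 3))
    print("ReLU:", paired_affinity(x, np.maximum(x, 0.0)))
    print("scaling:", paired_affinity(x, 2.0 * x + 1.0))
```

In this sketch, an activation that acts as a positive scaling plus a shift scores close to 1, while ReLU applied to zero-centered inputs scores noticeably lower. One limitation of the paired simplification: it reaches 1 only when the input-output relation is affine with a symmetric positive semi-definite matrix, so an affine map such as a rotation would still be penalized.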

Problem

Research questions and friction points this paper is trying to address.

Measure non-linearity in deep neural networks

Compare performance of different DNN architectures

Understand inner workings of DNNs in computer vision

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing non-linearity signature for DNNs

Using optimal transport mappings for measurement

Focusing on computer vision task analysis
