Beyond Linear and Overcomplete Regimes: A Mean-Field Analysis of Bottleneck Autoencoders

📅 2026-06-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of analytical understanding of training dynamics in nonlinear autoencoders with a fixed finite-dimensional bottleneck, a setting outside the linear and no-bottleneck cases to which existing theory is largely restricted. It introduces, for the first time in this regime, a mean-field approach that yields explicit learning dynamics for both the encoder and the decoder. Two guarantees follow. Over finite time horizons, the empirical risk of finite-width networks trained with stochastic gradient descent tracks the mean-field risk trajectory with high probability. At optimality, the finite-width risk converges to the mean-field optimum, so finite networks are expressive enough to approximate the infinite-width solution. Together these results give a tractable characterization of mean-field training dynamics for nonlinear bottleneck autoencoders.
📝 Abstract
Autoencoders (AEs) learn low-dimensional representations by mapping data into a latent space while minimizing reconstruction error. Despite their empirical success, theoretical understanding remains limited and largely restricted to linear models or settings without a bottleneck. In this work, we study nonlinear AEs with a fixed finite-dimensional bottleneck in the mean-field (MF) regime. We derive explicit MF learning dynamics for both encoder and decoder, providing a tractable characterization of training in the nonlinear setting. We show that, over finite time horizons, the empirical risk of finite-width networks trained with stochastic gradient descent closely tracks the MF risk trajectory with high probability. At optimality, we further establish that the finite-width risk converges to the MF optimum, demonstrating that finite networks are sufficiently expressive to approximate the infinite-width solution.
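To make the setting concrete, the sketch below builds a nonlinear autoencoder with a fixed bottleneck of dimension k and hidden width N, and measures how much the empirical reconstruction risk at random initialization varies across seeds as N grows. Everything in it is an illustrative assumption rather than the paper's construction: the 1/N averaging over neurons is one common mean-field parameterization, and the tanh activation, Gaussian initialization, synthetic Gaussian data, and all function names are hypothetical choices. It covers only the forward pass and the risk, not the SGD dynamics analyzed in the paper.

```python
import numpy as np


def init_params(n_hidden, d, k, rng):
    """Draw i.i.d. Gaussian weights for the encoder (W, A) and decoder (V, B)."""
    return {
        "W": rng.standard_normal((n_hidden, d)),  # encoder hidden layer
        "A": rng.standard_normal((k, n_hidden)),  # encoder readout into the bottleneck
        "V": rng.standard_normal((n_hidden, k)),  # decoder hidden layer
        "B": rng.standard_normal((d, n_hidden)),  # decoder readout to data space
    }


def reconstruct(params, X):
    """Encode rows of X into k dimensions, then decode back to d dimensions.

    Each readout averages over the N hidden neurons (the assumed 1/N
    mean-field scaling), so the output is an empirical mean over neurons.
    """
    n_hidden = params["W"].shape[0]
    Z = np.tanh(X @ params["W"].T) @ params["A"].T / n_hidden      # shape (n, k)
    X_hat = np.tanh(Z @ params["V"].T) @ params["B"].T / n_hidden  # shape (n, d)
    return Z, X_hat


def empirical_risk(params, X):
    """Mean squared reconstruction error over the rows of X."""
    _, X_hat = reconstruct(params, X)
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))


def risk_spread(n_hidden, X, k, n_seeds=8):
    """Standard deviation of the initial risk across independent initializations."""
    d = X.shape[1]
    risks = [
        empirical_risk(init_params(n_hidden, d, k, np.random.default_rng(seed)), X)
        for seed in range(n_seeds)
    ]
    return float(np.std(risks))


d, k = 10, 3  # bottleneck dimension k is smaller than the data dimension d
X = np.random.default_rng(1234).standard_normal((200, d))  # synthetic data

spread_narrow = risk_spread(50, X, k)
spread_wide = risk_spread(5000, X, k)
print(f"risk spread at N=50:   {spread_narrow:.2e}")
print(f"risk spread at N=5000: {spread_wide:.2e}")
```

Under these assumptions the spread shrinks as N increases, which is the law-of-large-numbers intuition behind comparing a finite-width network with its mean-field limit. The bottleneck dimension k stays fixed while only the hidden width grows.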
Problem

Research questions and friction points this paper is trying to address.

autoencoders
bottleneck
nonlinear
mean-field
theoretical analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

bottleneck autoencoders
mean-field analysis
nonlinear autoencoders
finite-width networks
stochastic gradient descent
Santanu Das
STCS department, Tata Institute Of Fundamental Research, Mumbai, India-400005
Ramyak Bilas
Department of Mathematics, Indiana University Bloomington, Indiana, USA-47405
Pascal Esser
Department of Mathematics, Ludwig-Maximilians-Universität München, Germany, München-80799
Satyaki Mukherjee
National University of Singapore
Probability, Random Matrix Theory, Machine learning