Beyond Linear and Overcomplete Regimes: A Mean-Field Analysis of Bottleneck Autoencoders

📅 2026-06-05

🤖 AI Summary

This work addresses the lack of analytical understanding of training dynamics in nonlinear autoencoders with a fixed, finite-dimensional bottleneck, a setting outside the linear and bottleneck-free (overcomplete) regimes covered by earlier theory. It uses a mean-field approach to derive explicit learning dynamics for both the encoder and the decoder. Over finite time horizons, the empirical risk of finite-width networks trained with stochastic gradient descent tracks the mean-field risk trajectory with high probability. At optimality, the finite-width risk converges to the mean-field optimum, which shows that finite networks are expressive enough to approximate the infinite-width solution.

📝 Abstract

Autoencoders (AEs) learn low-dimensional representations by mapping data into a latent space while minimizing reconstruction error. Despite their empirical success, theoretical understanding remains limited and largely restricted to linear models or settings without a bottleneck. In this work, we study nonlinear AEs with a fixed finite-dimensional bottleneck in the mean-field (MF) regime. We derive explicit MF learning dynamics for both encoder and decoder, providing a tractable characterization of training in the nonlinear setting. We show that, over finite time horizons, the empirical risk of finite-width networks trained with stochastic gradient descent closely tracks the MF risk trajectory with high probability. At optimality, we further establish that the finite-width risk converges to the MF optimum, demonstrating that finite networks are sufficiently expressive to approximate the infinite-width solution.
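To make the setting in the abstract concrete, the following is a minimal sketch of a nonlinear bottleneck autoencoder whose hidden layers are averaged with a 1/N factor, the scaling commonly used in mean-field analyses of wide networks. It is an illustration under assumptions, not the paper's construction: the two-layer encoder and decoder, the tanh activation, the Gaussian initialization, and every name (`init_params`, `forward`, `reconstruction_risk`) are hypothetical choices made here.

```python
import numpy as np


def init_params(n_in, n_latent, width, rng):
    """Draw i.i.d. Gaussian weights for a width-`width` encoder and decoder (assumed init)."""
    return {
        "W_enc": rng.normal(size=(width, n_in)),      # encoder hidden weights
        "A_enc": rng.normal(size=(n_latent, width)),  # encoder output weights
        "W_dec": rng.normal(size=(width, n_latent)),  # decoder hidden weights
        "A_dec": rng.normal(size=(n_in, width)),      # decoder output weights
    }


def forward(params, X):
    """Map a batch X of shape (n_samples, n_in) to latent codes and reconstructions."""
    width = params["W_enc"].shape[0]
    # Encoder: average over hidden units (1/width scaling), output has fixed size n_latent.
    H_enc = np.tanh(X @ params["W_enc"].T)
    Z = H_enc @ params["A_enc"].T / width
    # Decoder: same averaged form, mapping the bottleneck back to the input space.
    H_dec = np.tanh(Z @ params["W_dec"].T)
    X_hat = H_dec @ params["A_dec"].T / width
    return Z, X_hat


def reconstruction_risk(params, X):
    """Empirical risk: mean squared reconstruction error over the batch."""
    _, X_hat = forward(params, X)
    return float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 8))  # synthetic data, for illustration only
    for width in (16, 256, 4096):
        params = init_params(n_in=8, n_latent=2, width=width, rng=rng)
        print(width, reconstruction_risk(params, X))
```

The bottleneck dimension `n_latent` stays fixed while `width` grows, which is the regime the abstract describes. The script only evaluates the risk at initialization; it does not implement the stochastic gradient descent training or the mean-field dynamics analyzed in the paper.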

Problem

Research questions and friction points this paper is trying to address.

autoencoders

bottleneck

nonlinear

mean-field

theoretical analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

bottleneck autoencoders

mean-field analysis

nonlinear autoencoders