A prism hierarchy of learning regimes in large linear autoencoders

📅 2026-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of a systematic theoretical characterization of the extreme learning regimes of large weight-tied linear autoencoders. We propose a hierarchy of formal loss expansions organized by the geometry of a triangular prism, in which the five basic limiting regimes (large-data, small-data, mean-field, narrow-latent, and free) are associated with the 2-faces of the prism. Using gradient flow analysis, formal loss expansions, and asymptotic derivations, we derive explicit expressions for the evolution of both training and population losses in four of these regimes (all except the free regime). These theoretical predictions agree very well with numerical experiments.

📝 Abstract

Theoretical studies of machine learning models commonly consider different limiting regimes in which the learning dynamics of gradient descent become theoretically tractable. It is, however, desirable to have a systematically obtained picture of all qualitatively different extreme learning regimes for a particular type of model. In this paper we propose such a picture for large weight-tied linear autoencoders characterized by input and latent dimensions, initialization magnitude, and training set size. This model is nonlinear in the weights and its gradient flow does not have a general theoretical solution. We show that at the level of the formal loss-expansion hierarchy, its extreme regimes are naturally associated with faces of a triangular prism. In particular, there are five basic extreme regimes associated with the 2-faces of the prism: (1) large-data, (2) small-data, (3) mean-field, (4) narrow-latent, and (5) free. For regimes (1) through (4), we derive explicit expressions for both train and population limiting loss evolutions under gradient flow, obtaining very good agreement with experimental results.
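
To make the setup concrete, the following is a minimal sketch of a weight-tied linear autoencoder trained by small-step gradient descent, used here as a discretized stand-in for gradient flow. The abstract does not give the exact parametrization, so everything specific is an assumption made for illustration: the reconstruction map x -> WᵀWx, the squared-error loss with a 1/2 factor, isotropic Gaussian data (so the population covariance is the identity), and the hypothetical names `d` (input dimension), `k` (latent dimension), `n` (training set size), and `sigma` (initialization magnitude).

```python
import numpy as np

def tied_autoencoder_losses(d=20, k=5, n=50, sigma=0.1, lr=0.01, steps=2000, seed=0):
    """Track train and population loss of a weight-tied linear autoencoder.

    Assumed model (not taken from the paper): x_hat = W.T @ W @ x,
    loss = 0.5 * mean ||x - x_hat||^2, data x ~ N(0, I_d).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))          # training set, one sample per row
    W = sigma * rng.standard_normal((k, d))  # encoder weights; the decoder is W.T
    S = X.T @ X / n                          # empirical covariance of the training set
    I = np.eye(d)
    train, pop = [], []
    for _ in range(steps):
        R = I - W.T @ W                      # residual map x -> x - x_hat
        train.append(0.5 * np.trace(R @ S @ R))  # loss on the training set
        pop.append(0.5 * np.trace(R @ R))        # loss under identity covariance
        # Gradient of the train loss with respect to W is -W (S R + R S);
        # the loss is quartic in W, so the dynamics are nonlinear in the weights.
        grad = -W @ (S @ R + R @ S)
        W = W - lr * grad                    # explicit Euler step of the gradient flow
    return np.array(train), np.array(pop)

if __name__ == "__main__":
    train, pop = tied_autoencoder_losses()
    print("train loss: %.3f -> %.3f" % (train[0], train[-1]))
    print("population loss: %.3f -> %.3f" % (pop[0], pop[-1]))
```

Varying `d`, `k`, `n`, and `sigma` relative to one another in a sketch like this is one way to probe the four quantities that, according to the abstract, characterize the model; which scalings correspond to which of the five regimes is specified in the paper, not here.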

Problem

Research questions and friction points this paper is trying to address.

linear autoencoders

learning regimes

gradient descent

extreme regimes

theoretical analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

linear autoencoders

learning regimes

gradient flow