AI Summary
This work investigates how feature encoding evolves during the training of Restricted Boltzmann Machines (RBMs). Methodologically, it combines singular value decomposition to track the evolution of the weight matrix, analytical mean-field analysis, numerical training experiments, and finite-size scaling theory. The key contribution is the first identification, within energy-based generative models (EBMs), of a cascade of mean-field-like paramagnetic-to-ferromagnetic phase transitions during training. Empirically, the transitions sharpen in the high-dimensional limit; the first transition of the cascade belongs to the mean-field universality class; and learning proceeds sequentially: the model first captures the center of mass of the modes of the empirical data distribution, then progressively resolves its principal modes. The study connects high-dimensional learning dynamics to the statistical-physics theory of phase transitions, offering a framework for understanding how energy-based generative models learn.
Abstract
In this paper, we investigate the feature encoding process in a prototypical energy-based generative model, the Restricted Boltzmann Machine (RBM). We start with an analytical investigation using simplified architectures and data structures, and end with a numerical analysis of actual training runs on real datasets. Our study tracks the evolution of the model's weight matrix through its singular value decomposition, revealing a series of phase transitions associated with the progressive learning of the principal modes of the empirical probability distribution. The model first learns the center of mass of the modes and then progressively resolves all modes through a cascade of phase transitions. We first describe this process in a controlled setup that allows us to study the training dynamics analytically. We then validate our theoretical results by training the Bernoulli-Bernoulli RBM on real datasets. By using datasets of increasing dimension, we show that learning indeed leads to sharp phase transitions in the high-dimensional limit. Moreover, we propose and test a mean-field finite-size scaling hypothesis. This shows that the first phase transition is in the same universality class as the one we studied analytically, which is reminiscent of the mean-field paramagnetic-to-ferromagnetic phase transition.
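To make the tracking procedure concrete, the following is a minimal sketch of recording the singular values of a Bernoulli-Bernoulli RBM weight matrix once per epoch. It is not the authors' code. Several things here are assumptions made for illustration: the training rule (one-step contrastive divergence, CD-1, which the abstract does not specify), the synthetic two-cluster binary dataset, all hyperparameters, and the hypothetical function names `make_two_cluster_data` and `train_rbm_track_svd`.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def make_two_cluster_data(n_samples=400, n_visible=20, flip_prob=0.1, seed=0):
    """Synthetic binary data (an assumption): two opposite patterns with random bit flips."""
    rng = np.random.default_rng(seed)
    pattern = rng.integers(0, 2, size=n_visible)
    centers = np.stack([pattern, 1 - pattern])      # two cluster centers
    labels = rng.integers(0, 2, size=n_samples)     # cluster assignment per sample
    clean = centers[labels]
    flips = rng.random((n_samples, n_visible)) < flip_prob
    return np.where(flips, 1 - clean, clean).astype(float)


def train_rbm_track_svd(data, n_hidden=8, epochs=50, lr=0.05,
                        batch_size=32, n_track=4, seed=0):
    """Train with CD-1 (an assumed training rule) and log leading singular values of W."""
    rng = np.random.default_rng(seed)
    n_samples, n_visible = data.shape
    W = 1e-3 * rng.standard_normal((n_visible, n_hidden))  # small initial weights
    a = np.zeros(n_visible)                                 # visible biases
    b = np.zeros(n_hidden)                                  # hidden biases
    history = []
    for _ in range(epochs):
        perm = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            v0 = data[perm[start:start + batch_size]]
            # Positive phase: hidden probabilities and samples given the data.
            ph0 = sigmoid(v0 @ W + b)
            h0 = (rng.random(ph0.shape) < ph0).astype(float)
            # Negative phase: one Gibbs step back to the visible layer and up again.
            pv1 = sigmoid(h0 @ W.T + a)
            v1 = (rng.random(pv1.shape) < pv1).astype(float)
            ph1 = sigmoid(v1 @ W + b)
            m = v0.shape[0]
            W += lr * (v0.T @ ph0 - v1.T @ ph1) / m
            a += lr * (v0 - v1).mean(axis=0)
            b += lr * (ph0 - ph1).mean(axis=0)
        # Singular values come back sorted in descending order.
        s = np.linalg.svd(W, compute_uv=False)
        history.append(s[:n_track])
    return W, np.array(history)


if __name__ == "__main__":
    data = make_two_cluster_data()
    W, history = train_rbm_track_svd(data)
    for epoch in (0, 9, 24, 49):
        print(epoch + 1, np.round(history[epoch], 3))
```

Plotting each column of `history` against the epoch index gives one curve per singular value. In the picture described in the abstract, modes are learned one after another, so one would look for these curves to depart from their small initial values at different times; whether this toy setup shows that clearly depends on the assumed hyperparameters.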