🤖 AI Summary
This paper identifies a fundamental issue in probabilistic generative modeling: overfitting to the global data distribution leads models to memorize rather than genuinely generate. To address this, we propose the Mutually Exclusive Probability Space (MESP) theory and the Local Correlation Hypothesis (LCH), which formally characterize, for the first time, the optimization conflict arising from latent-variable distribution overlap in VAEs, and derive a variational lower bound parameterized by an overlap coefficient. We further design a Binary Latent Autoencoder (BL-AE) and an Autoregressive Random Variable Model (ARVM) that outputs histograms. Empirical validation supports LCH: generative capability stems from local structural correlations among latent variables, not global statistical alignment. ARVM achieves state-of-the-art FID scores on standard benchmarks; however, deeper analysis reveals that a low FID can mask memorization, underscoring the critical role of local modeling in achieving authentic generative capacity.
📝 Abstract
We propose two theoretical frameworks, the Mutually Exclusive Probability Space (MESP) and the Local Correlation Hypothesis (LCH), to explore a potential limitation of probabilistic generative models: namely, that learning global distributions leads to memorization rather than generative behavior. MESP emerges from our rethinking of the Variational Autoencoder (VAE). We observe that the latent variable distributions in a VAE overlap, which creates an optimization conflict between the reconstruction loss and the KL-divergence loss; we refer to this phenomenon as Mutually Exclusive Probability Spaces, and we derive a lower bound based on the overlap coefficient. Building on MESP, we propose a Binary Latent Autoencoder (BL-AE) that encodes images into binary latent representations. These binary latents serve as input to our Autoregressive Random Variable Model (ARVM), a modified autoregressive model that outputs histograms. ARVM achieves competitive FID scores, outperforming state-of-the-art methods on standard datasets; however, these scores reflect memorization rather than generation. To address this issue, we propose the Local Correlation Hypothesis (LCH), which posits that generative capability arises from local correlations among latent variables. Comprehensive experiments and discussions validate our frameworks.
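The abstract's lower bound is parameterized by an overlap coefficient between latent variable distributions. As an illustrative sketch only (not the paper's exact formulation), the overlap coefficient of two equal-variance 1D Gaussians, the shared area under their densities, has a closed form: the densities cross at the midpoint between the means, so the overlap equals twice the tail mass beyond half the mean separation. The function name `gaussian_overlap` below is our own, chosen for illustration:

```python
import math

def gaussian_overlap(mu1: float, mu2: float, sigma: float) -> float:
    """Overlap coefficient (shared density area) of N(mu1, sigma^2)
    and N(mu2, sigma^2). With equal variances the densities intersect
    at the midpoint, so overlap = 2 * P(X > |mu1 - mu2| / 2) for
    X ~ N(0, sigma^2)."""
    d = abs(mu1 - mu2) / 2.0
    # Standard-normal CDF evaluated at d / sigma, via the error function.
    phi = 0.5 * (1.0 + math.erf((d / sigma) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

# Identical latent distributions overlap completely ...
print(gaussian_overlap(0.0, 0.0, 1.0))            # → 1.0
# ... while well-separated ones barely overlap.
print(round(gaussian_overlap(0.0, 6.0, 1.0), 4))  # → 0.0027
```

This captures the trade-off the abstract describes: pulling every posterior toward the same prior (as the KL term encourages) drives the overlap toward 1, while faithful per-sample reconstruction favors separated, low-overlap posteriors.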