🤖 AI Summary
This paper addresses the maximum-likelihood training challenge for latent-variable energy-based models (LVEBMs). We propose a manifold-based saddle-point optimization framework grounded in particle dynamics. Training is reformulated as a saddle-point problem over distributions on the latent and joint manifolds, in which the joint negative distribution and the conditional latent distribution are optimized together. Coupled Wasserstein gradient flows alternately update a pool of joint negative samples and conditional latent particles, simulated numerically via overdamped Langevin dynamics and interleaved with stochastic parameter ascent, without requiring discriminators or auxiliary networks. Theoretically, the derived evidence lower bound (ELBO) is strictly tighter than bounds obtained with restricted amortized posteriors, and the scheme comes with convergence guarantees in KL divergence and Wasserstein-2 distance. Empirically, the method achieves high-fidelity generation and strong disentanglement of latent variables on tasks such as physical-system modeling, performing competitively with comparable approaches.
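As a rough illustration of this alternating scheme, here is a minimal sketch, assuming a generic PyTorch energy network `energy(x, z)` that returns per-sample energies; the function names, step sizes, and loop structure are our own simplifications, not the authors' implementation:

```python
import torch

def langevin_step(energy, x, z, step=1e-2, update_x=True):
    """One overdamped Langevin update on the joint energy E(x, z).

    With update_x=False only the latent particles move, giving the
    conditional dynamics for fixed observations x.
    """
    x = x.detach().requires_grad_(update_x)
    z = z.detach().requires_grad_(True)
    e = energy(x, z).sum()
    if update_x:
        gx, gz = torch.autograd.grad(e, [x, z])
        x = x - step * gx + (2 * step) ** 0.5 * torch.randn_like(x)
    else:
        (gz,) = torch.autograd.grad(e, [z])
    z = z - step * gz + (2 * step) ** 0.5 * torch.randn_like(z)
    return x.detach(), z.detach()

def training_step(energy, opt, x_data, z_post, x_neg, z_neg,
                  n_inner=5, step=1e-2):
    """Inner particle flows followed by one stochastic parameter-ascent step."""
    for _ in range(n_inner):
        # Joint negative pool: Langevin on observed and latent coordinates.
        x_neg, z_neg = langevin_step(energy, x_neg, z_neg, step, update_x=True)
        # Conditional latent particles: Langevin on z only, data x held fixed.
        _, z_post = langevin_step(energy, x_data, z_post, step, update_x=False)
    # Contrastive surrogate: descending it matches ascent on the likelihood,
    # whose parameter gradient is E_neg[grad E] - E_pos[grad E].
    opt.zero_grad()
    loss = energy(x_data, z_post).mean() - energy(x_neg, z_neg).mean()
    loss.backward()
    opt.step()
    return z_post, x_neg, z_neg
```

The `(2 * step) ** 0.5` noise scale is the Euler-Maruyama discretization of the overdamped Langevin SDE dX = -∇E(X) dt + √2 dW; running it on the joint pool and on the conditional particles simulates the two coupled Wasserstein gradient flows, while the contrastive loss implements the stochastic parameter-ascent step.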
📄 Abstract
Latent-variable energy-based models (LVEBMs) assign a single normalized energy to joint pairs of observed data and latent variables, offering expressive generative modeling while capturing hidden structure. We recast maximum-likelihood training as a saddle problem over distributions on the latent and joint manifolds and view the inner updates as coupled Wasserstein gradient flows. The resulting algorithm alternates overdamped Langevin updates on a joint negative pool and on conditional latent particles with stochastic parameter ascent, requiring no discriminators or auxiliary networks. We prove existence and convergence under standard smoothness and dissipativity assumptions, with decay rates in KL divergence and Wasserstein-2 distance. The saddle-point view further yields an ELBO strictly tighter than bounds obtained with restricted amortized posteriors. Our method is evaluated on numerical approximations of physical systems and performs competitively against comparable approaches.
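For concreteness, one standard route to such a saddle formulation, reconstructed here from the abstract rather than taken from the paper (the notation, with joint energy E(x, z) parameterized by θ, conditional q(z | x), and joint ρ(x, z), is ours): applying the Gibbs variational principle to both the conditional log-normalizer and the partition function turns maximum likelihood into a max-min problem over distributions.

```latex
% Joint energy E_\theta(x,z), partition function Z(\theta) = \iint e^{-E_\theta(x,z)}\,dx\,dz.
% The Gibbs variational principle, applied to \log \int e^{-E_\theta(x,z)}\,dz and to
% \log Z(\theta), gives a saddle objective over a conditional q and a joint \rho:
\max_{\theta}\; \sup_{q(z \mid x)}\; \inf_{\rho(x,z)}\;
  \mathbb{E}_{p_{\mathrm{data}}(x)\, q(z \mid x)}\!\bigl[-E_\theta(x,z)\bigr]
  + \mathbb{E}_{p_{\mathrm{data}}}\!\bigl[\mathcal{H}\bigl(q(\cdot \mid x)\bigr)\bigr]
  + \mathbb{E}_{\rho(x,z)}\!\bigl[E_\theta(x,z)\bigr]
  - \mathcal{H}(\rho)
```

The suprema are attained at the exact conditional p(z | x) and the exact joint p(x, z) under the model. Because q ranges over all conditionals rather than a restricted amortized family, the attained bound can only be tighter, which is the sense in which the abstract's ELBO claim should be read; the Wasserstein gradient flows of the two inner functionals are Fokker-Planck dynamics, which is what the overdamped Langevin updates simulate.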