🤖 AI Summary
Variational autoencoders (VAEs) often suffer from posterior collapse, which degrades generative diversity; existing mitigation strategies rely on regularization trade-offs or architectural constraints, limiting generalizability and controllability. This paper proposes an architecture-agnostic method for localized posterior collapse control: we define a local collapse metric and introduce a latent reconstruction (LR) loss that leverages the mathematical properties of injective and composite functions to enable end-to-end optimization within the variational inference framework. The LR loss requires no architectural modifications and jointly preserves reconstruction fidelity and latent identifiability. Experiments on MNIST, FashionMNIST, Omniglot, CelebA, and FFHQ demonstrate that our approach significantly alleviates posterior collapse, markedly improving sample diversity and distribution coverage. The method establishes a more robust and generalizable control paradigm for VAE training.
📝 Abstract
Variational autoencoders (VAEs), among the most widely used generative models, are known to suffer from posterior collapse, a phenomenon that reduces the diversity of generated samples. To avoid posterior collapse, many prior works have tried to control the influence of the regularization loss; however, the resulting trade-off between reconstruction and regularization remains unsatisfactory. For this reason, several methods have been proposed to guarantee latent identifiability, which is key to avoiding posterior collapse, but they impose structural constraints on the network architecture. To clarify the phenomenon, we define local posterior collapse, which reflects the importance of individual sample points in the data space and relaxes the network constraint. We then propose the Latent Reconstruction (LR) loss, inspired by the mathematical properties of injective and composite functions, to control posterior collapse without restriction to a specific architecture. We experimentally evaluate our approach and show that it controls posterior collapse on varied datasets such as MNIST, FashionMNIST, Omniglot, CelebA, and FFHQ.
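The abstract links posterior collapse control to injectivity of the encoder and the behavior of the composite map encoder∘decoder. One natural reading, sketched below as a minimal assumption (the paper's exact loss definition is not given here), is a cycle-consistency penalty in latent space: if re-encoding a decoded latent recovers the original latent, the encoder is injective on the decoder's image, which supports latent identifiability. The `encode`/`decode` maps below are hypothetical linear stand-ins, not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the encoder E and decoder D.
# The paper's method is architecture-agnostic, so any networks could
# be substituted here.
W_enc = rng.normal(size=(2, 4))  # E: x in R^4 -> z in R^2
W_dec = rng.normal(size=(4, 2))  # D: z in R^2 -> x_hat in R^4

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

def latent_reconstruction_loss(z):
    """Penalize the gap between z and E(D(z)).

    If E∘D is close to the identity on the latent space, E restricted
    to the decoder's image is injective -- the property the abstract
    associates with avoiding posterior collapse. This is an assumed
    interpretation of the LR loss, shown for illustration only.
    """
    z_cycle = encode(decode(z))
    return float(np.mean((z_cycle - z) ** 2))

z = rng.normal(size=2)
loss = latent_reconstruction_loss(z)
```

In a full training loop, a term like this would be added to the usual ELBO (reconstruction plus KL regularization) and minimized end-to-end, which matches the abstract's claim that no architectural modification is required.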