AI Summary
This paper addresses the lack of a unified theoretical framework for variational dimensionality reduction. It proposes a unified Variational Information Bottleneck (VIB) framework that jointly optimizes encoder-based information compression and decoder-based generative fidelity, enabling principled information trade-offs in latent space. Key contributions include: (1) introducing DVSIB and beta-DVCCA, novel methods that extend the multivariate information bottleneck to deep variational settings for the first time; (2) establishing theoretical connections between DSIB and contrastive learning approaches (e.g., Barlow Twins) via mutual information regularization; and (3) proposing symmetric and weighted mutual information regularization to support multi-view representation learning and generative modeling. Evaluated on Noisy MNIST and Noisy CIFAR-100, the framework achieves significant improvements in classification accuracy, latent dimension efficiency, and sample efficiency, attaining state-of-the-art or competitive performance.
Abstract
Variational dimensionality reduction methods are widely used for their accuracy, generative capabilities, and robustness. We introduce a unifying framework that generalizes both traditional and state-of-the-art methods. The framework is based on an interpretation of the multivariate information bottleneck, trading off the information preserved in an encoder graph (defining what to compress) against that in a decoder graph (defining a generative model for data). Using this approach, we rederive existing methods, including the deep variational information bottleneck, variational autoencoders, and the deep multiview information bottleneck. We naturally extend the deep variational CCA (DVCCA) family to beta-DVCCA and introduce a new method, the deep variational symmetric information bottleneck (DVSIB). DSIB, the deterministic limit of DVSIB, connects to modern contrastive learning approaches such as Barlow Twins, among others. We evaluate these methods on Noisy MNIST and Noisy CIFAR-100, showing that algorithms better matched to the structure of the problem, like DVSIB and beta-DVCCA, produce better latent spaces, as measured by classification accuracy, latent-variable dimensionality, and sample efficiency, and consistently outperform other approaches under comparable conditions. Additionally, we benchmark against state-of-the-art models, achieving superior or competitive accuracy. Our results demonstrate that this framework can seamlessly incorporate diverse multi-view representation learning algorithms, providing a foundation for designing novel, problem-specific loss functions.
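The encoder/decoder trade-off described above can be illustrated with a toy beta-weighted VIB-style objective. The sketch below is not the paper's implementation: it assumes a diagonal-Gaussian posterior, a standard-normal prior, and squared-error distortion as the decoder term, and all function names are hypothetical.

```python
import math

def kl_gauss_std_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, log_var))

def beta_vib_loss(mu, log_var, x, x_recon, beta):
    """Toy beta-VIB objective: decoder distortion plus beta * encoder compression rate.

    Larger beta compresses the latent more (encoder-graph term); smaller beta
    favors reconstruction fidelity (decoder-graph term).
    """
    distortion = sum((a - b) ** 2 for a, b in zip(x, x_recon))  # decoder fidelity
    rate = kl_gauss_std_normal(mu, log_var)                     # encoder compression
    return distortion + beta * rate
```

For example, a posterior already matching the prior (`mu=0`, `log_var=0`) contributes zero rate, so with a perfect reconstruction the loss is zero for any beta; increasing beta penalizes posteriors that deviate from the prior more strongly.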