Mutual information and task-relevant latent dimensionality

📅 2026-02-08
🤖 AI Summary
This work addresses the problem of estimating the task-relevant dimension of a latent representation, that is, the dimensionality of the representation needed for prediction. Framing the problem within the information bottleneck principle, the authors ask what embedding bottleneck dimension is sufficient to compress the predictor and predicted views while preserving their mutual information (MI). They show that standard neural MI estimators with separable or bilinear critics systematically inflate the inferred dimension, and they propose a hybrid critic that keeps an explicit dimensional bottleneck while allowing flexible nonlinear cross-view interactions. A one-shot protocol reads off the effective dimension from a single over-parameterized hybrid model, without sweeping over bottleneck sizes. The method is validated on synthetic problems with known task-relevant dimension, extended to intrinsic dimensionality through paired views of a single dataset, and applied to multiple physics datasets. In noisy regimes where classical geometric dimension estimators degrade, it remains reliable.
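
The bottleneck question has a closed form in one simple setting, which may help fix ideas. For jointly Gaussian views, the MI equals the sum of -0.5 * log(1 - rho_i^2) over the canonical correlations rho_i, and a linear bottleneck of size k retains the k largest terms. The sketch below sweeps k on synthetic data with a known 3-dimensional shared latent. It is a linear-Gaussian stand-in, not the neural estimator described in this work; the function name `gaussian_mi_by_bottleneck`, the data-generating setup, and the 0.01-nat gain threshold are all illustrative assumptions.

```python
import numpy as np

def gaussian_mi_by_bottleneck(x, y):
    """Cumulative Gaussian MI (nats) kept by a linear bottleneck of size k = 1, 2, ..."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    # Orthonormalize each view with a reduced QR; the singular values of
    # Qx^T Qy are the sample canonical correlations between the views.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    rho = np.linalg.svd(qx.T @ qy, compute_uv=False)
    rho = np.clip(rho, 0.0, 1.0 - 1e-12)  # guard against log(0)
    # Entry k-1 is the MI retained by the top-k canonical directions.
    return np.cumsum(-0.5 * np.log1p(-rho**2))

# Synthetic paired views sharing a 3-dimensional latent (assumed setup).
rng = np.random.default_rng(0)
n, d_latent, d_x, d_y = 20_000, 3, 10, 8
z = rng.normal(size=(n, d_latent))
x = z @ rng.normal(size=(d_latent, d_x)) + 0.3 * rng.normal(size=(n, d_x))
y = z @ rng.normal(size=(d_latent, d_y)) + 0.3 * rng.normal(size=(n, d_y))

mi_k = gaussian_mi_by_bottleneck(x, y)
gains = np.diff(mi_k, prepend=0.0)   # MI gained by each extra dimension
k_hat = int(np.sum(gains > 0.01))    # hypothetical saturation threshold (nats)

for k, (total, gain) in enumerate(zip(mi_k, gains), start=1):
    print(f"k={k}: MI={total:.3f} nats (gain {gain:.4f})")
print("estimated task-relevant dimension:", k_hat)
```

The MI curve rises steeply up to the shared latent dimension and then flattens, with only small finite-sample gains afterwards. A sweep of this kind is what a one-shot protocol is meant to avoid, and the linear-Gaussian assumption is what a neural critic is meant to relax.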

📝 Abstract
Estimating the dimensionality of the latent representation needed for prediction, the task-relevant dimension, is a difficult, largely unsolved problem with broad scientific applications. We cast it as an Information Bottleneck question: what embedding bottleneck dimension is sufficient to compress predictor and predicted views while preserving their mutual information (MI)? This repurposes neural MI estimators for dimensionality estimation. We show that standard neural estimators with separable/bilinear critics systematically inflate the inferred dimension, and we address this by introducing a hybrid critic that retains an explicit dimensional bottleneck while allowing flexible nonlinear cross-view interactions, thereby preserving the latent geometry. We further propose a one-shot protocol that reads off the effective dimension from a single over-parameterized hybrid model, without sweeping over bottleneck sizes. We validate the approach on synthetic problems with known task-relevant dimension. We extend the approach to intrinsic dimensionality by constructing paired views of a single dataset, enabling comparison with classical geometric dimension estimators. In noisy regimes where those estimators degrade, our approach remains reliable. Finally, we demonstrate the utility of the method on multiple physics datasets.
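
To make the distinction between critic families concrete, here is a minimal NumPy sketch of the two forms scored under the standard InfoNCE lower bound on MI. The separable critic is the inner product of two k-dimensional embeddings. The hybrid form shown here, a small MLP applied to the concatenated k-dimensional embeddings, is only one plausible reading of "explicit bottleneck plus nonlinear cross-view interactions" and is an assumption, not the authors' architecture. All names (`mlp`, `init_mlp`, `separable_scores`, `hybrid_scores`, `infonce_bound`) and layer sizes are hypothetical. The weights are random and untrained, so the printed values only exercise the forward pass.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random (W, b) pairs for an MLP with the given layer sizes."""
    return [(rng.normal(scale=1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, a):
    """Tiny tanh MLP with a linear output layer."""
    for w, b in params[:-1]:
        a = np.tanh(a @ w + b)
    w, b = params[-1]
    return a @ w + b

def separable_scores(gx, hy):
    # f(x_i, y_j) = <g(x_i), h(y_j)>: cross-view interaction is bilinear.
    return gx @ hy.T

def hybrid_scores(head, gx, hy):
    # Assumed hybrid form: f(x_i, y_j) = T([g(x_i), h(y_j)]), where T is a
    # small MLP that only ever sees the k-dimensional embeddings.
    n = gx.shape[0]
    pairs = np.concatenate(
        [np.repeat(gx, n, axis=0), np.tile(hy, (n, 1))], axis=1)
    return mlp(head, pairs).reshape(n, n)

def infonce_bound(scores):
    """InfoNCE lower bound on MI (nats); scores[i, i] are the true pairs."""
    n = scores.shape[0]
    row_max = scores.max(axis=1, keepdims=True)
    # Numerically stable row-wise log-sum-exp.
    lse = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    return float(np.mean(np.diag(scores) - lse) + np.log(n))

rng = np.random.default_rng(0)
n, d_x, d_y, k = 64, 10, 8, 4            # k is the bottleneck dimension
x = rng.normal(size=(n, d_x))
y = rng.normal(size=(n, d_y))
g = init_mlp(rng, [d_x, 32, k])          # embeds the predictor view
h = init_mlp(rng, [d_y, 32, k])          # embeds the predicted view
head = init_mlp(rng, [2 * k, 32, 1])     # nonlinear cross-view head
gx, hy = mlp(g, x), mlp(h, y)

sep = separable_scores(gx, hy)
hyb = hybrid_scores(head, gx, hy)
print("separable InfoNCE (untrained):", infonce_bound(sep))
print("hybrid InfoNCE (untrained):   ", infonce_bound(hyb))
```

In practice the embeddings and head would be trained by gradient ascent on the bound, which calls for an autodiff framework and is omitted here. Note that the InfoNCE bound cannot exceed log(n) for a batch of n pairs, so batch size limits how much MI it can report.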
Problem

Research questions and friction points this paper is trying to address.

mutual information
latent dimensionality
task-relevant dimension
information bottleneck
intrinsic dimensionality
Innovation

Methods, ideas, or system contributions that make the work stand out.

mutual information
information bottleneck
latent dimensionality
hybrid critic
one-shot estimation