GeoIB: Geometry-Aware Information Bottleneck via Statistical-Manifold Compression

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of traditional information bottleneck (IB) methods, which rely on surrogate estimates of mutual information and suffer from loose bounds, estimation bias, and inadequate control over compression. From an information-geometric perspective, the authors reformulate the IB principle by unifying compression and prediction as Kullback–Leibler projections onto a statistical manifold. They introduce a distribution-level Fisher–Rao metric and a geometric Jacobian–Frobenius regularizer, jointly governed by a single bottleneck coefficient. This approach eliminates the need for mutual information estimation and enables, for the first time, direct and stable joint optimization of compression and prediction. Experiments demonstrate that the proposed GeoIB significantly outperforms existing baselines across multiple standard datasets, achieving superior trade-offs between accuracy and compression in the information plane, while also enhancing model invariance and training stability.
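The projection view in the summary can be made concrete with a standard information-geometry identity (my own rendering, not quoted from the paper): mutual information is the minimal KL divergence from the joint distribution to the independence manifold of product distributions, and the minimum is attained at the product of the true marginals.

```latex
I(X;Z) \;=\; \min_{q_X,\, q_Z}\; \mathrm{KL}\!\left(p_{XZ}\,\big\|\,q_X \otimes q_Z\right)
       \;=\; \mathrm{KL}\!\left(p_{XZ}\,\big\|\,p_X \otimes p_Z\right)
```

An analogous identity holds for I(Z;Y); the minimizer is the m-projection of the joint onto the independence manifold, which is what lets the paper trade MI estimation for geometric projections.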

📝 Abstract
Information Bottleneck (IB) is widely used, but in deep learning it is usually implemented through tractable surrogates, such as variational bounds or neural mutual information (MI) estimators, rather than by directly controlling the MI I(X;Z) itself. The looseness and estimator-dependent bias of these surrogates make IB "compression" only indirectly controlled and optimization fragile. We revisit the IB problem through the lens of information geometry and propose a **Geo**metric **I**nformation **B**ottleneck (**GeoIB**) that dispenses with MI estimation. We show that I(X;Z) and I(Z;Y) admit exact projection forms as minimal Kullback–Leibler (KL) distances from the joint distributions to their respective independence manifolds. Guided by this view, GeoIB controls information compression with two complementary terms: (i) a distribution-level Fisher–Rao (FR) discrepancy, which matches KL to second order and is reparameterization-invariant; and (ii) a geometry-level Jacobian–Frobenius (JF) term that provides a local capacity-type upper bound on I(Z;X) by penalizing pullback volume expansion of the encoder. We further derive a natural-gradient optimizer consistent with the FR metric and prove that the standard additive natural-gradient step is first-order equivalent to the geodesic update. Extensive experiments show that GeoIB achieves a better trade-off between prediction accuracy and compression ratio in the information plane than mainstream IB baselines on popular datasets. GeoIB improves invariance and optimization stability by unifying distributional and geometric regularization under a single bottleneck multiplier. The source code of GeoIB is released at https://anonymous.4open.science/r/G-IB-0569.
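As a hedged illustration of the Jacobian–Frobenius idea in the abstract, here is a minimal NumPy sketch: the toy encoder, function names, and the specific penalty form are my own assumptions for illustration, not the paper's released implementation.

```python
import numpy as np

def encoder(x, W):
    """Toy deterministic encoder z = tanh(W x)."""
    return np.tanh(W @ x)

def jf_penalty(x, W):
    """Squared Frobenius norm of the encoder Jacobian at x.

    For z = tanh(W x), the Jacobian is diag(1 - z**2) @ W; penalizing
    its Frobenius norm discourages local volume expansion of the
    encoder map, the capacity-style control the abstract describes.
    """
    z = np.tanh(W @ x)
    J = (1.0 - z**2)[:, None] * W   # row-scale W by the tanh derivative
    return float(np.sum(J**2))

# In training, the total objective would look like
#   task_loss + beta * jf_penalty(x, W)
# with a single bottleneck coefficient beta, echoing the paper's
# "single bottleneck multiplier" for both regularizers.
```

For a near-linear regime (small weights) the penalty reduces to the squared Frobenius norm of W itself, which makes the "local capacity" reading easy to sanity-check.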
Problem

Research questions and friction points this paper is trying to address.

Information Bottleneck
Mutual Information Estimation
Compression
Optimization Stability
Deep Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Bottleneck
Information Geometry
Fisher-Rao Metric
Natural Gradient
Mutual Information Estimation