Contrastive Predictive Coding Done Right for Mutual Information Estimation

📅 2025-10-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
InfoNCE, widely adopted for mutual information (MI) estimation, suffers from systematic bias and fails to provide consistent estimates of true MI. To address this, we propose InfoNCE-anchor: a plug-in, consistent MI estimator derived by augmenting InfoNCE with learnable auxiliary anchor classes, thereby substantially reducing density-ratio estimation bias. Building upon scoring rule theory, we further establish a unified framework that reveals fundamental connections among contrastive learning objectives. Experiments demonstrate that InfoNCE-anchor achieves state-of-the-art accuracy in MI estimation. However, this improvement does not translate into gains on downstream self-supervised tasks, suggesting that representation learning relies more critically on structured density-ratio modeling than on precise MI values. Our work provides new theoretical insights into the foundations of contrastive learning and the reliability of MI estimation.

📝 Abstract
The InfoNCE objective, originally introduced for contrastive representation learning, has become a popular choice for mutual information (MI) estimation, despite its indirect connection to MI. In this paper, we demonstrate why InfoNCE should not be regarded as a valid MI estimator, and we introduce a simple modification, which we refer to as InfoNCE-anchor, for accurate MI estimation. Our modification introduces an auxiliary anchor class, enabling consistent density ratio estimation and yielding a plug-in MI estimator with significantly reduced bias. Beyond this, we generalize our framework using proper scoring rules, which recover InfoNCE-anchor as a special case when the log score is employed. This formulation unifies a broad spectrum of contrastive objectives, including NCE, InfoNCE, and $f$-divergence variants, under a single principled framework. Empirically, we find that InfoNCE-anchor with the log score achieves the most accurate MI estimates; however, in self-supervised representation learning experiments, we find that the anchor does not improve the downstream task performance. These findings corroborate that contrastive representation learning benefits not from accurate MI estimation per se, but from the learning of structured density ratios.
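To make the abstract's claim concrete, here is a minimal sketch of how InfoNCE is typically used as a plug-in MI estimate: given a K×K critic score matrix (diagonal entries scoring positive pairs), the estimate is log K minus the categorical cross-entropy of picking the positive among K candidates, which caps the estimate at log K, one source of the bias the paper discusses. This is a generic illustration of standard InfoNCE, not the paper's anchor modification; the function name and NumPy-based formulation are our own assumptions.

```python
import numpy as np

def infonce_mi_estimate(scores: np.ndarray) -> float:
    """Plain InfoNCE MI estimate from a K x K critic score matrix.

    scores[i, j] = f(x_i, y_j); the diagonal holds positive pairs.
    Returns log K + mean diagonal log-softmax, which is upper-bounded
    by log K regardless of the true MI (a known limitation; note this
    is the standard estimator, not the paper's InfoNCE-anchor variant).
    """
    K = scores.shape[0]
    # numerically stable row-wise log-softmax
    m = scores.max(axis=1, keepdims=True)
    log_probs = scores - m - np.log(np.exp(scores - m).sum(axis=1, keepdims=True))
    # average log-probability of the positive pair, shifted by log K
    return float(np.mean(np.diag(log_probs)) + np.log(K))
```

With an uninformative critic (all scores equal) the estimate is 0, and even a near-perfect critic cannot push it above log K, illustrating why larger batches are needed to estimate large MI values with this objective.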
Problem

Research questions and friction points this paper is trying to address.

Demonstrating InfoNCE's invalidity as a mutual information estimator
Introducing anchor modification for accurate mutual information estimation
Unifying contrastive objectives under principled scoring rule framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces anchor class for density ratio estimation
Generalizes framework using proper scoring rules
Unifies contrastive objectives under principled framework
J. Jon Ryu
Department of EECS, MIT, Cambridge, MA 02139, USA
Pavan Yeddanapudi
Department of EECS, MIT, Cambridge, MA 02139, USA
Xiangxiang Xu
University of Rochester
Information Theory, Information Processing, Machine Learning Theory, Representation Learning
Gregory W. Wornell
Department of EECS, MIT, Cambridge, MA 02139, USA