A Hierarchical Decomposition of Kullback-Leibler Divergence: Disentangling Marginal Mismatches from Statistical Dependencies

📅 2025-04-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the diagnostic ambiguity in multivariate KL divergence arising from the entanglement of marginal mismatch and statistical dependencies. We propose an algebraically exact, fully additive hierarchical decomposition method based on Möbius inversion over the subset lattice—yielding the first closed-form, complete analytic decomposition of KL divergence. The total divergence between a joint distribution and its product reference is rigorously disentangled into a sum of independent marginal mismatch terms and all-order (r-wise) statistical dependency terms, expressed solely in terms of standard Shannon information measures, without approximations or modeling assumptions. The framework unifies higher-order mutual information, total correlation, and information-geometric principles. Numerical experiments confirm machine-precision accuracy across diverse systems, substantially enhancing attribution-based diagnostic capability for divergence sources in machine learning, econometrics, and complex systems analysis.

📝 Abstract
The Kullback-Leibler (KL) divergence is a foundational measure for comparing probability distributions. Yet in multivariate settings, its structure is often opaque, conflating marginal mismatches and statistical dependencies. We derive an algebraically exact, additive, and hierarchical decomposition of the KL divergence between a joint distribution \(P_k\) and a product reference \(Q^{\otimes k}\). The total divergence splits into the sum of marginal KLs, \(\sum_{i=1}^k \mathrm{KL}(P_i \| Q)\), and the total correlation \(C(P_k)\), which we further decompose as \(C(P_k) = \sum_{r=2}^k I^{(r)}(P_k)\) using Möbius inversion on the subset lattice. Each \(I^{(r)}\) quantifies the distinct contribution of \(r\)-way statistical interactions to the total divergence. This yields the first decomposition of this form that is both algebraically complete and interpretable using only standard Shannon quantities, with no approximations or model assumptions. Numerical validation using hypergeometric sampling confirms exactness to machine precision across diverse system configurations. This framework enables precise diagnosis of divergence origins (marginal versus interaction) across applications in machine learning, econometrics, and complex systems.
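The top-level identity is easy to check numerically. The sketch below (a minimal illustration, not the paper's code; the random joint distribution and alphabet sizes are arbitrary choices) verifies, for \(k = 2\) discrete variables, that \(\mathrm{KL}(P \,\|\, Q \otimes Q)\) equals the sum of the marginal KLs plus the total correlation, which for two variables is just the mutual information \(I(X_1; X_2)\):

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions (natural log)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical joint distribution P over a 3x3 alphabet (for illustration only)
rng = np.random.default_rng(0)
P = rng.random((3, 3))
P /= P.sum()

# Reference marginal Q on the same single-variable alphabet
Q = rng.random(3)
Q /= Q.sum()

# Left-hand side: KL between the joint P and the product reference Q x Q
lhs = kl(P.ravel(), np.outer(Q, Q).ravel())

# Right-hand side: sum of marginal KLs plus the total correlation C(P)
P1, P2 = P.sum(axis=1), P.sum(axis=0)
marginal_terms = kl(P1, Q) + kl(P2, Q)
total_correlation = kl(P.ravel(), np.outer(P1, P2).ravel())  # = I(X1; X2) for k = 2

# The two sides agree to machine precision
assert abs(lhs - (marginal_terms + total_correlation)) < 1e-12
```

The identity follows by adding and subtracting \(\log \prod_i P_i(x_i)\) inside the KL integrand, so exactness holds for any discrete \(P\) and strictly positive \(Q\).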
Problem

Research questions and friction points this paper is trying to address.

Decompose KL divergence into marginal and dependency components
Quantify distinct contributions of multi-way statistical interactions
Enable precise diagnosis of divergence origins in applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical decomposition of KL divergence
Moebius inversion on subset lattice
Exact marginal and interaction quantification
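The paper's exact construction of the \(I^{(r)}\) terms is not reproduced in this summary. As a sketch of the Möbius-inversion technique on the subset lattice, the hypothetical helper below computes the classical co-information of a variable subset as an alternating sum of marginal entropies over its sublattice; for a subset of size 2 this recovers pairwise mutual information:

```python
import numpy as np
from itertools import combinations

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def subset_entropy(P, axes_keep):
    """Entropy of the marginal of joint array P over the variable subset axes_keep."""
    drop = tuple(i for i in range(P.ndim) if i not in axes_keep)
    return entropy(P.sum(axis=drop).ravel())

def co_information(P, S):
    """Co-information of subset S via an alternating (Möbius-style) sum
    of entropies over all nonempty subsets T of S."""
    total = 0.0
    for r in range(1, len(S) + 1):
        for T in combinations(S, r):
            total += (-1) ** (r + 1) * subset_entropy(P, set(T))
    return total

# Hypothetical 3-variable binary joint distribution (for illustration only)
rng = np.random.default_rng(1)
P = rng.random((2, 2, 2))
P /= P.sum()

# For |S| = 2 the alternating sum reduces to mutual information I(X_0; X_1) >= 0
print(co_information(P, (0, 1)))
```

Co-information is one standard output of Möbius inversion on the entropy lattice; the paper's \(I^{(r)}\), which must sum exactly to the total correlation, may be defined differently.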