Measuring Sample Quality with Copula Discrepancies

📅 2025-07-28
🤖 AI Summary
In modern Bayesian inference, scalable MCMC methods such as SGLD introduce bias that invalidates conventional sample-quality diagnostics such as effective sample size (ESS). The problem is most acute when assessing multivariate dependence structure, which is often the core inferential objective. To address this, we propose the Copula Discrepancy (CD), a diagnostic that uses Sklar's theorem to separate a sample's dependence structure from its marginals and to quantify how faithfully a biased sample reproduces that structure. This gives the first structure-aware framework for evaluating biased samplers. We implement CD in two forms: a moment-based estimator and a robust MLE-based variant, both with significantly lower computational overhead than Stein-based alternatives. In experiments, the moment-based CD outperforms ESS and other standard diagnostics for hyperparameter selection, correctly identifying optimal configurations. The MLE-based variant detects tail-dependence mismatches that rank-correlation diagnostics miss, distinguishing samples that agree in Kendall's tau but differ in extreme-event behavior.

📝 Abstract
The scalable Markov chain Monte Carlo (MCMC) algorithms that underpin modern Bayesian machine learning, such as Stochastic Gradient Langevin Dynamics (SGLD), sacrifice asymptotic exactness for computational speed, creating a critical diagnostic gap: traditional sample quality measures fail catastrophically when applied to biased samplers. While powerful Stein-based diagnostics can detect distributional mismatches, they provide no direct assessment of dependence structure, often the primary inferential target in multivariate problems. We introduce the Copula Discrepancy (CD), a principled and computationally efficient diagnostic that leverages Sklar's theorem to isolate and quantify the fidelity of a sample's dependence structure independent of its marginals. Our theoretical framework provides the first structure-aware diagnostic specifically designed for the era of approximate inference. Empirically, we demonstrate that a moment-based CD dramatically outperforms standard diagnostics like effective sample size for hyperparameter selection in biased MCMC, correctly identifying optimal configurations where traditional methods fail. Furthermore, our robust MLE-based variant can detect subtle but critical mismatches in tail dependence that remain invisible to rank correlation-based approaches, distinguishing between samples with identical Kendall's tau but fundamentally different extreme-event behavior. With computational overhead orders of magnitude lower than existing Stein discrepancies, the CD provides both immediate practical value for MCMC practitioners and a theoretical foundation for the next generation of structure-aware sample quality assessment.
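The abstract does not spell out how the moment-based CD is computed, so the following is only a minimal sketch of one plausible reading: estimate Kendall's tau from the sample, invert it to a copula parameter, and report the distance to the target parameter. The Clayton family, the absolute-difference form, and the names `kendall_tau`, `clayton_theta_from_tau`, `sample_clayton`, and `moment_cd` are all assumptions made for illustration, not the authors' definitions.

```python
import random


def kendall_tau(xs, ys):
    """Kendall's tau by direct pair counting (O(n^2), fine for a sketch)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def clayton_theta_from_tau(tau):
    """Invert the Clayton relation tau = theta / (theta + 2)."""
    return 2.0 * tau / (1.0 - tau)


def sample_clayton(theta, n, rng):
    """Draw n pairs from a Clayton copula by conditional inversion."""
    us, vs = [], []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        us.append(u)
        vs.append(v)
    return us, vs


def moment_cd(xs, ys, target_theta):
    """Assumed discrepancy: |theta_hat - theta_target| (hypothetical form)."""
    return abs(clayton_theta_from_tau(kendall_tau(xs, ys)) - target_theta)
```

Because Kendall's tau depends only on ranks, this quantity is unchanged by strictly increasing transformations of either coordinate, which is the sense in which such a diagnostic ignores the marginals. A sample drawn with the wrong dependence strength (for example `theta = 0.5` against a target of `2.0`) should score noticeably worse than one drawn at the target.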
Problem

Research questions and friction points this paper is trying to address.

Assessing sample quality in biased MCMC algorithms
Evaluating dependence structure in multivariate samples
Detecting subtle mismatches in tail dependence behavior
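To make the last point concrete, here is a small sketch using two textbook copula families. The choice of Clayton and Gumbel is an assumption for illustration (the summary does not say which families the paper compares); the closed-form relations between the parameter, Kendall's tau, and the tail-dependence coefficients are standard results for these families.

```python
def clayton_theta(tau):
    """Clayton: tau = theta / (theta + 2)."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta(tau):
    """Gumbel: tau = 1 - 1 / theta."""
    return 1.0 / (1.0 - tau)


def clayton_tail(theta):
    """Return (lower, upper) tail-dependence coefficients for Clayton."""
    return (2.0 ** (-1.0 / theta), 0.0)


def gumbel_tail(theta):
    """Return (lower, upper) tail-dependence coefficients for Gumbel."""
    return (0.0, 2.0 - 2.0 ** (1.0 / theta))


tau = 0.5  # both copulas are calibrated to the same Kendall's tau
print("Clayton (lower, upper):", clayton_tail(clayton_theta(tau)))
print("Gumbel  (lower, upper):", gumbel_tail(gumbel_theta(tau)))
```

At the same Kendall's tau, the Clayton copula concentrates its dependence in the joint lower tail and the Gumbel copula in the joint upper tail, so a diagnostic that only looks at tau cannot tell them apart.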
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages Sklar's theorem for dependence structure
Introduces moment-based Copula Discrepancy diagnostic
Detects tail dependence mismatches efficiently
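For reference, the decomposition that Sklar's theorem provides can be written as follows. The notation ($F$ for the joint distribution function, $F_i$ for the marginals, $C$ for the copula) is standard but is an assumption here, since the summary does not give the paper's own symbols.

```latex
% Sklar's theorem: any joint CDF F with marginals F_1, ..., F_d
% can be written through a copula C on [0,1]^d.
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)

% If the marginals are continuous, C is unique and is recovered by
C(u_1, \dots, u_d) = F\bigl(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)\bigr),
\qquad (u_1, \dots, u_d) \in [0,1]^d .
```

The copula $C$ carries all of the dependence information and none of the marginal information, which is why a diagnostic built on it can assess dependence structure separately from the marginals.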
Agnideep Aich
Department of Mathematics, University of Louisiana at Lafayette, Lafayette, Louisiana, USA.
Ashit Baran Aich
Former Professor of Statistics, Presidency College
Statistics, Statistical Machine Learning, Probability, Statistical Learning, Deep Learning
Bruce Wade
Department of Mathematics, University of Louisiana at Lafayette, Lafayette, Louisiana, USA.