🤖 AI Summary
This work addresses the challenge of performing Bayesian inference in unnormalized models, where the intractable normalizing constant hinders conventional approaches. The authors propose a fully Bayesian inference framework that treats the normalizing constant as an unknown parameter and, leveraging noise contrastive estimation, reformulates inference as a binary classification task between observed data and noise samples. By integrating Pólya–Gamma data augmentation with Gibbs sampling, the method efficiently handles exponential-family unnormalized models without requiring tuning parameters, thereby circumventing the sensitivity to likelihood tempering that plagues existing techniques. Experiments on time-varying density models for temporal point processes and sparse torus graph models demonstrate that the approach yields accurate parameter estimates and reliable uncertainty quantification.
📝 Abstract
Unnormalized (or energy-based) models provide a flexible framework for capturing the characteristics of data with complex dependency structures. However, the application of standard Bayesian inference methods has been severely limited because the parameter-dependent normalizing constant is either analytically intractable or computationally prohibitive to evaluate. A promising approach is score-based generalized Bayesian inference, which avoids evaluating the normalizing constant by replacing the likelihood with a scoring rule. However, this approach requires careful tuning of the weight placed on the likelihood term, and it may fail to yield valid inference without appropriate control. To overcome this difficulty, we propose a fully Bayesian framework for inference on unnormalized models that does not require such tuning. We build on noise contrastive estimation, which recasts inference as a binary classification problem between observed and noise samples, and treat the normalizing constant as an additional unknown parameter within the resulting likelihood. For exponential families, the classification likelihood becomes conditionally Gaussian via Pólya–Gamma data augmentation, leading to a simple Gibbs sampler. We demonstrate the proposed approach through two models: time-varying density models for temporal point process data and sparse torus graph models for multivariate circular data. Simulation studies and real-data analyses show that the proposed method provides accurate point estimation and enables principled uncertainty quantification.
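The core computational device described in the abstract, making a logistic (classification-type) likelihood conditionally Gaussian via Pólya–Gamma data augmentation so that a Gibbs sampler applies, can be illustrated in the simpler setting of Bayesian logistic regression. The sketch below is not the authors' implementation: it omits the NCE construction and the normalizing-constant parameter, uses a truncated sum-of-gammas approximation to the PG(1, c) distribution, and assumes a standard normal prior on the coefficients. All function names are hypothetical.

```python
import numpy as np

def sample_pg1(c, rng, K=200):
    """Approximate draws from Polya-Gamma PG(1, c) via the truncated
    infinite-sum-of-gammas representation (K terms), elementwise in c."""
    k = np.arange(1, K + 1)
    g = rng.gamma(1.0, 1.0, size=(c.size, K))
    denom = (k - 0.5) ** 2 + (c[:, None] / (2.0 * np.pi)) ** 2
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

def gibbs_logistic(X, y, n_iter=500, rng=None):
    """Gibbs sampler for Bayesian logistic regression: conditional on
    PG latent variables omega, the update for beta is exactly Gaussian."""
    if rng is None:
        rng = np.random.default_rng()
    n, p = X.shape
    beta = np.zeros(p)
    B0_inv = np.eye(p)        # assumed N(0, I) prior on beta
    kappa = y - 0.5
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # omega_i | beta ~ PG(1, x_i' beta)
        omega = sample_pg1(X @ beta, rng)
        # beta | omega, y ~ N(m, V): the conditionally Gaussian step
        V = np.linalg.inv((X.T * omega) @ X + B0_inv)
        m = V @ (X.T @ kappa)
        beta = rng.multivariate_normal(m, V)
        draws[t] = beta
    return draws
```

On synthetic data the posterior mean recovers the generating coefficients; the paper's method replaces this plain logistic likelihood with the NCE classification likelihood in which the unknown normalizing constant enters as an extra parameter, but the Gaussian conditional update has the same structure.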