Sampling and Identity-Testing Without Approximate Tensorization of Entropy

📅 2025-06-29
🤖 AI Summary
This work addresses sampling and identity testing for high-dimensional mixture distributions: mixtures of a small number of components, each satisfying approximate tensorization of entropy (ATE), where the mixture itself need not satisfy ATE. The authors show that Glauber dynamics started from a data-based initialization mixes fast, with optimal sample complexity, when each component satisfies a modified log-Sobolev inequality (MLSI). This extends earlier results for mixtures of distributions satisfying Poincaré inequalities. They also give efficient identity testers for mixtures of ATE distributions in the coordinate-conditional sampling access model, answering an open question posed by Blanca et al., and they simplify and improve the original algorithm of Blanca et al.

📝 Abstract
Certain tasks in high-dimensional statistics become easier when the underlying distribution satisfies a local-to-global property called approximate tensorization of entropy (ATE). For example, the Glauber dynamics Markov chain of an ATE distribution mixes fast and can produce approximate samples in a small amount of time, since such a distribution satisfies a modified log-Sobolev inequality. Moreover, identity-testing for an ATE distribution requires few samples if the tester is given coordinate conditional access to the unknown distribution, as shown by Blanca, Chen, Štefankovič, and Vigoda (COLT 2023).

A natural class of distributions that do not satisfy ATE consists of mixtures of (few) distributions that do satisfy ATE. We study the complexity of identity-testing and sampling for these distributions. Our main results are the following:

1. We show fast mixing of Glauber dynamics from a data-based initialization, with optimal sample complexity, for mixtures of distributions satisfying modified log-Sobolev inequalities. This extends work of Huang, Koehler, Lee, Mohanty, Rajaraman, Vuong, and Wu (STOC 2025, COLT 2025) for mixtures of distributions satisfying Poincaré inequalities.
2. Answering an open question posed by Blanca et al., we give efficient identity-testers for mixtures of ATE distributions in the coordinate-conditional sampling access model. We also give some simplifications and improvements to the original algorithm of Blanca et al.
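
For readers unfamiliar with the term, the following is the standard textbook form of approximate tensorization of entropy, written as a sketch rather than quoted from the paper. The symbols (a distribution \mu on an n-coordinate product space, a constant C, and x_{-i} for all coordinates except the i-th) are notation assumed here, and normalization conventions for C vary between papers. The display uses \operatorname, which requires amsmath.

```latex
% mu satisfies ATE with constant C if, for every nonnegative function f,
\[
  \operatorname{Ent}_{\mu}(f) \;\le\; C \sum_{i=1}^{n}
  \mathbb{E}_{x \sim \mu}\!\left[ \operatorname{Ent}_{\mu(\cdot \mid x_{-i})}(f) \right],
  \qquad
  \operatorname{Ent}_{\nu}(f) = \mathbb{E}_{\nu}[f \log f] - \mathbb{E}_{\nu}[f] \log \mathbb{E}_{\nu}[f].
\]
```

The right-hand side only involves entropies of single-coordinate conditional distributions, which is the sense in which the property is "local-to-global".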
Problem

Research questions and friction points this paper is trying to address.

Study identity-testing for mixtures of ATE distributions
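Analyze the complexity of sampling from mixtures of ATE distributions, which need not satisfy ATE themselves
Simplify and improve the identity-testing algorithm of Blanca et al. in the coordinate-conditional sampling access model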
Innovation

Methods, ideas, or system contributions that make the work stand out.

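Fast mixing of Glauber dynamics from a data-based initialization, for mixtures of distributions satisfying modified log-Sobolev inequalities
Efficient identity-testers for mixtures of ATE distributions in the coordinate-conditional sampling access model
Simplifications and improvements to the original algorithm of Blanca et al.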
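
To make the idea of Glauber dynamics from a data-based initialization concrete, here is a minimal sketch on a toy target. It is not the paper's algorithm or analysis. The target (a mixture of product Bernoulli distributions on {0,1}^n) and all function names are assumptions chosen for illustration. In this toy case exact samples are easy to draw, so the "data" are generated directly; in the setting of the paper the data would be samples from the target distribution.

```python
import math
import random

def log_weight(x, skip, p):
    """Log-probability of x under Bernoulli(p)^n, ignoring coordinate `skip`."""
    ones = sum(x) - x[skip]
    zeros = (len(x) - 1) - ones
    return ones * math.log(p) + zeros * math.log(1 - p)

def conditional_prob_one(x, i, biases, weights):
    """P(x_i = 1 | x_{-i}) under the mixture sum_k weights[k] * Bernoulli(biases[k])^n."""
    logs = [math.log(w) + log_weight(x, i, p) for w, p in zip(weights, biases)]
    m = max(logs)  # subtract the max for numerical stability
    post = [math.exp(l - m) for l in logs]
    total = sum(post)
    # Posterior over components given x_{-i}, averaged against each component's bias.
    return sum(q / total * p for q, p in zip(post, biases))

def glauber_step(x, biases, weights, rng):
    """One Glauber update: pick a uniform coordinate, resample it from its conditional."""
    i = rng.randrange(len(x))
    x[i] = 1 if rng.random() < conditional_prob_one(x, i, biases, weights) else 0

def sample_mixture(n, biases, weights, rng):
    """Exact sample from the toy mixture (stands in for observed data)."""
    p = rng.choices(biases, weights=weights)[0]
    return [1 if rng.random() < p else 0 for _ in range(n)]

def glauber_from_data(data, steps, biases, weights, rng):
    """Start the chain at a uniformly chosen data point, then run `steps` updates."""
    x = list(rng.choice(data))
    for _ in range(steps):
        glauber_step(x, biases, weights, rng)
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    biases, weights, n = [0.9, 0.1], [0.5, 0.5], 20  # hypothetical toy parameters
    data = [sample_mixture(n, biases, weights, rng) for _ in range(50)]
    print(glauber_from_data(data, steps=200, biases=biases, weights=weights, rng=rng))
```

The design point the sketch illustrates is the choice of starting state: a chain started from an arbitrary state can stay in one mixture component for a long time, whereas a start drawn from data already lands in each component with roughly the right frequency, so the chain only needs to mix within a component.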