🤖 AI Summary
Precise characterization of convergence rates for general state-space Markov chains, especially in MCMC and stochastic optimization, remains challenging: classical analytical methods often fail to produce practically useful bounds for realistic chains.
Method: We propose the Deep Contractive Drift Calculator (DCDC), the first general-purpose sample-based algorithm for bounding Markov chain convergence. DCDC formulates a Contractive Drift Equation (CDE) and solves it with a neural-network-based solver; the solution is then converted into an explicit convergence bound in Wasserstein distance.
Contribution/Results: DCDC combines contractive drift analysis with deep learning. The accompanying theory analyzes the sample complexity of the algorithm. Experiments on realistic chains, including stochastic processing networks and constant step-size SGD, demonstrate that DCDC is effective at generating explicit convergence bounds in settings where traditional analytical methods often yield no practically useful bound.
📝 Abstract
Convergence rate analysis for general state-space Markov chains is fundamentally important in areas such as Markov chain Monte Carlo and algorithmic analysis (for computing explicit convergence bounds). This problem, however, is notoriously difficult because traditional analytical methods often do not generate practically useful convergence bounds for realistic Markov chains. We propose the Deep Contractive Drift Calculator (DCDC), the first general-purpose sample-based algorithm for bounding the convergence of Markov chains to stationarity in Wasserstein distance. DCDC has two components. First, inspired by the new convergence analysis framework in (Qu et al., 2023), we introduce the Contractive Drift Equation (CDE), the solution of which leads to an explicit convergence bound. Second, we develop an efficient neural-network-based CDE solver. Equipped with these two components, DCDC solves the CDE and converts the solution into a convergence bound. We analyze the sample complexity of the algorithm and further demonstrate the effectiveness of DCDC by generating convergence bounds for realistic Markov chains arising from stochastic processing networks as well as constant step-size stochastic optimization.
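
To make the notion of an explicit Wasserstein convergence bound concrete, the sketch below works through a toy case. It is not DCDC and does not solve the CDE, whose form is not given here. It assumes a hypothetical one-dimensional constant step-size SGD chain on the quadratic f(x) = curvature * x^2 / 2 with additive Gaussian gradient noise. All function names and parameter values are illustrative. Two copies of the chain are driven by the same noise (a synchronous coupling), so their gap shrinks by the factor (1 - step * curvature) at every step. Because any coupling upper-bounds the Wasserstein-1 distance, this gives the explicit bound W1 <= |x0 - y0| * (1 - step * curvature)^k between the laws of the two chains after k steps.

```python
import random


def sgd_step(x, noise, step, curvature):
    """One constant step-size SGD step on f(x) = curvature * x**2 / 2.

    The stochastic gradient is the true gradient plus additive noise
    (an assumption made for this toy example).
    """
    return x - step * (curvature * x + noise)


def coupled_distances(x0, y0, n_steps, step=0.1, curvature=2.0, seed=0):
    """Run two chains with shared noise and record |X_k - Y_k| at each step."""
    rng = random.Random(seed)
    x, y = x0, y0
    dists = [abs(x - y)]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)  # same noise drives both chains
        x = sgd_step(x, noise, step, curvature)
        y = sgd_step(y, noise, step, curvature)
        dists.append(abs(x - y))
    return dists


# Hypothetical starting points and parameters, chosen only for illustration.
x0, y0, step, curvature, n_steps = 5.0, -3.0, 0.1, 2.0, 20
dists = coupled_distances(x0, y0, n_steps, step, curvature)

# Explicit geometric bound implied by the synchronous coupling.
rate = 1.0 - step * curvature
bound = [abs(x0 - y0) * rate**k for k in range(n_steps + 1)]

for k in (0, 5, 10, 20):
    print(f"k={k:2d}  coupled gap={dists[k]:.6f}  bound={bound[k]:.6f}")
```

In this toy case the contraction factor can be read off by hand. The chains targeted in the abstract (stochastic processing networks, general constant step-size stochastic optimization) do not admit such a direct calculation, which is the gap a sample-based solver is meant to fill.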