🤖 AI Summary
This work addresses efficient sampling from high-dimensional lattice fibers in algebraic statistics, where the fibers are defined by algebraic constraints such as those arising in log-linear models. Methodologically, we propose a multilevel discrete Monte Carlo algorithm that combines Markov basis methods with ideas from the multilevel Monte Carlo (MLMC) paradigm. We introduce the Fiber Coverage Score (FCS), a metric based on Voronoi partitioning for assessing sample quality, and use Maximum Mean Discrepancy (MMD) as a further quality metric. Our key contribution is an adaptation of MLMC ideas to discrete algebraic sampling: the algorithm uses variable step sizes to accelerate exploration of the state space and reduce the need for long Markov chain burn-in. Simulations on benchmark fibers show that multilevel sampling outperforms naive MCMC, indicating that the multilevel strategy offers practical benefits for discrete sampling in algebraic statistics.
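To make the MMD quality metric concrete, here is a minimal sketch of the standard biased (V-statistic) estimate of squared MMD between two samples. It assumes each sampled table has been flattened into a tuple of counts and uses a Gaussian kernel with a hand-picked bandwidth. The kernel, the bandwidth, and the function names `gaussian_kernel` and `mmd_squared` are illustrative assumptions, not details taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    # Gaussian (RBF) kernel on two equal-length vectors; the kernel choice is an assumption.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys."""

    def mean_kernel(a, b):
        # Average kernel value over all pairs (u, v) with u in a and v in b.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    # Hypothetical flattened 2x2 tables, used only to exercise the function.
    sample_a = [(3, 1, 0, 4), (2, 2, 1, 3)]
    sample_b = [(0, 4, 3, 1), (1, 3, 2, 2)]
    print(mmd_squared(sample_a, sample_b, bandwidth=2.0))
```

A value near zero suggests the two samples are hard to distinguish under the chosen kernel, so comparing sampler output against a reference sample from the same fiber gives a rough quality check.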
📝 Abstract
This paper proposes a multilevel sampling algorithm for fiber sampling problems in algebraic statistics, inspired by Henry Wynn's suggestion to adapt multilevel Monte Carlo (MLMC) ideas to discrete models. Focusing on log-linear models, we sample from high-dimensional lattice fibers defined by algebraic constraints. Building on Markov basis methods and results from Diaconis and Sturmfels, our algorithm uses variable step sizes to accelerate exploration and reduce the need for long burn-in. We introduce a novel Fiber Coverage Score (FCS) based on Voronoi partitioning to assess sample quality, and highlight the utility of the Maximum Mean Discrepancy (MMD) quality metric. Simulations on benchmark fibers show that multilevel sampling outperforms naive MCMC approaches. Our results demonstrate that multilevel methods, when properly applied, provide practical benefits for discrete sampling in algebraic statistics.
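The idea of a Markov basis walk with variable step sizes can be sketched on the simplest log-linear example: the independence model for two-way contingency tables, whose fiber is the set of non-negative integer tables with fixed row and column sums. The sketch below uses the basic 2x2 moves described by Diaconis and Sturmfels and scales each move by a random integer multiple. The function name `fiber_walk`, the `max_step` parameter, and the uniform choice of step size are assumptions made for illustration; they are not the paper's multilevel scheme.

```python
import random


def fiber_walk(table, n_steps, max_step=3, seed=0):
    """Random walk on r x c tables with fixed row and column sums.

    Each proposal is a basic 2x2 Markov move scaled by a random integer k
    (an assumed stand-in for variable step sizes). Proposals that would
    create a negative entry are rejected and the chain stays in place.
    """
    rng = random.Random(seed)
    t = [row[:] for row in table]  # copy so the caller's table is untouched
    n_rows, n_cols = len(t), len(t[0])
    samples = []
    for _ in range(n_steps):
        i1, i2 = rng.sample(range(n_rows), 2)  # two distinct rows
        j1, j2 = rng.sample(range(n_cols), 2)  # two distinct columns
        k = rng.randint(1, max_step)           # step size for this proposal
        # The scaled move adds k at (i1, j1) and (i2, j2) and subtracts k at
        # (i1, j2) and (i2, j1), so every row and column sum is unchanged.
        if t[i1][j2] >= k and t[i2][j1] >= k:
            t[i1][j1] += k
            t[i2][j2] += k
            t[i1][j2] -= k
            t[i2][j1] -= k
        samples.append([row[:] for row in t])
    return samples


if __name__ == "__main__":
    start = [[3, 1, 2], [0, 4, 1], [2, 2, 5]]
    chain = fiber_walk(start, n_steps=1000, max_step=3)
    print(chain[-1])
```

Because the proposal is symmetric and a move is rejected only when it would leave the fiber, this chain targets the uniform distribution on the fiber. Setting `max_step=1` recovers the plain single-step walk, which gives a baseline for comparing how quickly larger steps move away from the starting table.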