🤖 AI Summary
This work addresses the sign problem in quantum Monte Carlo simulations, in which statistical errors grow exponentially with system size. The authors propose a control variate method based on autoregressive Transformers: a pair of models is trained on the positive- and negative-sign sectors with strictly disjoint support, each exactly normalized over its own sector, so that their difference is zero-mean by construction. An end-of-sequence parity mask enforces exact sign-sector resolution. The method is implemented within the stochastic series expansion framework, which is extended to frustrated lattices through an incremental loop-topology update, and a twist channel provides ergodic sampling across sign sectors. The incremental loop-count change and the cumulative frustration parity are supplied to the models as topological features. In small-system benchmarks of the triangular-lattice Heisenberg antiferromagnet, the technique reduces the standard error of the average sign by up to an order of magnitude and that of the energy estimator by a factor of three to five, remaining effective even when the average sign drops below 10⁻³.
📝 Abstract
We train a pair of autoregressive models to construct zero-mean control variates to mitigate the sign problem in quantum Monte Carlo simulations. The two autoregressive networks are confined to the positive- and negative-sign sectors with strictly disjoint support, and each is exactly normalized over its sector. Their difference is therefore structurally zero-mean, providing an unbiased auxiliary observable whose correlation with the sign estimator controls the variance reduction. We implement the method within the stochastic series expansion framework, which we extend to frustrated lattices by developing an incremental loop-topology update. Sign-ergodic sampling is achieved through a twist channel, which is the unique sign-changing mechanism on non-bipartite lattices. We implement the control variates as autoregressive transformers with an end-of-sequence parity mask that enforces exact sign-sector resolution, while the incremental loop-count change and cumulative frustration parity are incorporated as topological features. On the triangular-lattice Heisenberg antiferromagnet, we benchmark the method in the small-$N$ limit. The control variate reduces the standard error of the average sign by up to an order of magnitude and that of the energy estimator by a factor of three to five, remaining effective even when the average sign drops below $10^{-3}$. This work lays out the framework and provides a proof-of-principle demonstration that autoregressive control variates can effectively mitigate the sign problem. Scaling to larger systems with physics-informed architectures is the subject of future work.
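The zero-mean construction can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a small enumerable state space in which the sampling distribution $p$ is known exactly, and it replaces the trained autoregressive networks with noisy stand-ins `q_plus` and `q_minus`, each normalized over its own sign sector. All names (`control_variate_demo`, `abs_w`, `g`, `c`) are hypothetical. Under these assumptions the auxiliary observable $g = (q_+ - q_-)/p$ satisfies $\mathbb{E}_p[g] = \sum_x q_+(x) - \sum_x q_-(x) = 0$, so subtracting $c\,g$ from the sign leaves its mean unchanged while reducing its variance in proportion to the squared correlation between $g$ and the sign.

```python
import numpy as np

def control_variate_demo(n_states=64, n_samples=20000, seed=0):
    """Toy zero-mean control variate for the average sign (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical configuration space: each state has a weight |w| and a sign.
    abs_w = rng.random(n_states) + 0.1
    sign = np.where(rng.random(n_states) < 0.5, 1.0, -1.0)
    sign[0], sign[1] = 1.0, -1.0      # guarantee both sectors are non-empty
    p = abs_w / abs_w.sum()           # sampling distribution, known exactly here

    # Stand-ins for the two trained models: noisy copies of |w|, each
    # restricted to one sign sector and normalized over that sector.
    noisy = abs_w * np.exp(0.3 * rng.standard_normal(n_states))
    q_plus = np.where(sign > 0, noisy, 0.0)
    q_plus /= q_plus.sum()
    q_minus = np.where(sign < 0, noisy, 0.0)
    q_minus /= q_minus.sum()

    # Auxiliary observable; its mean under p is sum(q_plus) - sum(q_minus) = 0.
    g = (q_plus - q_minus) / p
    exact_mean_g = float(np.sum(p * g))

    # Draw samples from p and build the corrected estimator s - c * g.
    idx = rng.choice(n_states, size=n_samples, p=p)
    s, gs = sign[idx], g[idx]
    # Coefficient estimated from the same samples (adds an O(1/N) bias).
    c = np.cov(s, gs)[0, 1] / np.var(gs, ddof=1)
    corrected = s - c * gs

    return {
        "exact_sign": float(np.sum(p * sign)),
        "exact_mean_g": exact_mean_g,
        "raw_mean": float(s.mean()),
        "cv_mean": float(corrected.mean()),
        "raw_se": float(s.std(ddof=1) / np.sqrt(n_samples)),
        "cv_se": float(corrected.std(ddof=1) / np.sqrt(n_samples)),
    }

if __name__ == "__main__":
    for key, value in control_variate_demo().items():
        print(f"{key}: {value:.6f}")
```

In this toy the two stand-in models are already close to the sector-restricted target, so `g` correlates strongly with the sign. In an actual simulation $p$ is typically known only up to normalization and the models must be learned, which is where the quality of the trained networks determines how much variance reduction is achieved.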