🤖 AI Summary
Mean-field variational inference (MFVI) can approximate a posterior poorly because it assumes the latent variables are independent. To address this, we propose an efficient Gaussianization-based variational inference method built on coordinate rotation. The approach first rotates the parameter space orthogonally, using principal component analysis of a cross-covariance matrix that involves the target distribution's score function, so that posterior correlations among variables are reduced. It then runs MFVI in the rotated coordinates, which yields a coordinate-wise invertible map; iterating the rotation and the map progressively Gaussianizes the posterior. The method retains the computational efficiency of MFVI while improving approximation accuracy. In experiments on Bayesian inference tasks, it is more accurate than standard MFVI and much cheaper than expressive variational methods such as normalizing flows.
📝 Abstract
We propose to perform mean-field variational inference (MFVI) in a rotated coordinate system that reduces correlations between variables. The rotation is determined by principal component analysis (PCA) of a cross-covariance matrix involving the target's score function. Compared with standard MFVI along the original axes, MFVI in this rotated system often yields substantially more accurate approximations with negligible additional cost.
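The rotation step can be illustrated with a minimal sketch. The abstract does not spell out the exact cross-covariance matrix, so the version below is an assumption: it uses the symmetrized sample cross-covariance between reference samples and the target's score evaluated at those samples, and takes its eigenvectors as the rotation. The function name `pca_rotation` and the toy Gaussian target are hypothetical and chosen only for illustration.

```python
import numpy as np

def pca_rotation(samples, scores):
    """Return an orthogonal matrix whose columns are the principal axes of the
    symmetrized sample cross-covariance between samples and scores.

    Assumption: this particular matrix is an illustrative choice, not
    necessarily the one used in the paper.
    """
    centered_x = samples - samples.mean(axis=0)
    centered_s = scores - scores.mean(axis=0)
    cross_cov = centered_x.T @ centered_s / len(samples)
    # The sample estimate is not exactly symmetric, so symmetrize before eigh.
    sym = 0.5 * (cross_cov + cross_cov.T)
    _, eigvecs = np.linalg.eigh(sym)
    return eigvecs

# Toy target: zero-mean Gaussian with strongly correlated coordinates.
target_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
target_prec = np.linalg.inv(target_cov)

rng = np.random.default_rng(0)
samples = rng.standard_normal((20000, 2))   # draws from a standard Gaussian reference
scores = -samples @ target_prec             # score of N(0, Sigma) is -Sigma^{-1} x

rotation = pca_rotation(samples, scores)
# Covariance of the target expressed in the rotated coordinates.
rotated_cov = rotation.T @ target_cov @ rotation
```

For this Gaussian toy target the expected cross-covariance is proportional to the precision matrix, so its eigenvectors coincide with those of the covariance and `rotated_cov` comes out close to diagonal. A mean-field approximation fitted in the rotated coordinates therefore no longer has to ignore a strong correlation.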
MFVI in a rotated coordinate system defines a rotation and a coordinatewise map that together move the target closer to Gaussian. Iterating this procedure yields a sequence of transformations that progressively brings the target toward Gaussian. The resulting algorithm provides a computationally efficient way to construct flow-like transport maps: it requires only MFVI subproblems, avoids large-scale optimization, and yields transformations that are easy to invert and evaluate. In Bayesian inference tasks, we demonstrate that the proposed method achieves higher accuracy than standard MFVI while maintaining much lower computational cost than conventional normalizing flows.
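To see why such transformations are easy to invert and evaluate, consider the following sketch of a stack of layers, each consisting of a rotation followed by a coordinatewise map. It is written under stated assumptions: the coordinatewise map is taken to be affine purely for illustration (the paper's maps may be more general), the layers are filled with arbitrary values rather than fitted by MFVI, and the names `RotatedAffineLayer`, `apply_layers`, and `invert_layers` are hypothetical.

```python
import numpy as np

class RotatedAffineLayer:
    """One layer: rotate, then apply a coordinatewise affine map.

    Assumption: the affine form stands in for a generic coordinatewise
    invertible map.
    """

    def __init__(self, rotation, shift, scale):
        self.rotation = np.asarray(rotation, dtype=float)  # orthogonal matrix R
        self.shift = np.asarray(shift, dtype=float)
        self.scale = np.asarray(scale, dtype=float)        # nonzero entries

    def forward(self, x):
        # Rows of x are points; x @ R applies R^T to each point.
        return (x @ self.rotation - self.shift) / self.scale

    def inverse(self, z):
        # Undo the coordinatewise map, then rotate back using R R^T = I.
        return (z * self.scale + self.shift) @ self.rotation.T

    def log_abs_det(self):
        # The rotation contributes 0; the affine map contributes -sum(log|scale|).
        return -np.sum(np.log(np.abs(self.scale)))

def apply_layers(layers, x):
    for layer in layers:
        x = layer.forward(x)
    return x

def invert_layers(layers, z):
    for layer in reversed(layers):
        z = layer.inverse(z)
    return z

# Usage: two layers with arbitrary (not fitted) rotations, shifts, and scales.
rng = np.random.default_rng(1)
layers = []
for _ in range(2):
    rotation, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal matrix
    layers.append(RotatedAffineLayer(rotation, rng.standard_normal(3), rng.uniform(0.5, 2.0, 3)))

points = rng.standard_normal((5, 3))
round_trip = invert_layers(layers, apply_layers(layers, points))
total_log_det = sum(layer.log_abs_det() for layer in layers)
```

Because each rotation is orthogonal and each map acts on one coordinate at a time, the inverse and the log-determinant of the Jacobian come at the cost of a matrix product and a sum, with no linear solves or iterative inversion.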