🤖 AI Summary
Cryo-electron microscopy (cryo-EM) 3D reconstruction struggles to model structural heterogeneity, because conventional methods assume a discrete set of conformations and cannot capture continuous structural variability. This work reformulates reconstruction as a stochastic inverse problem over the space of probability measures, representing conformational heterogeneity as a continuous probability distribution over molecular structures. We pose reconstruction as the minimization of a statistical distance, such as the Kullback–Leibler (KL) divergence or the maximum mean discrepancy (MMD), between the observed and simulated image distributions, and carry out the optimization with a Wasserstein gradient flow solved numerically by a particle-based scheme. We relate the formulation to maximum a posteriori (MAP) estimation and give a consistency analysis, establishing conditions under which such discretize-then-optimize methods converge to the solution of the underlying infinite-dimensional problem. On synthetic examples, including a realistic protein model, the method recovers continuous distributions over structural states.
📝 Abstract
Cryo-electron microscopy (cryo-EM) enables high-resolution imaging of biomolecules, but structural heterogeneity remains a major challenge in 3D reconstruction. Traditional methods assume a discrete set of conformations, limiting their ability to recover continuous structural variability. In this work, we formulate cryo-EM reconstruction as a stochastic inverse problem (SIP) over probability measures, where the observed images are modeled as the push-forward of an unknown distribution over molecular structures via a random forward operator. We pose the reconstruction problem as the minimization of a variational discrepancy between observed and simulated image distributions, using statistical distances such as the KL divergence and the Maximum Mean Discrepancy (MMD). The resulting optimization is performed over the space of probability measures via a Wasserstein gradient flow, which we solve numerically using particles to represent and evolve conformational ensembles. We validate our approach using synthetic examples, including a realistic protein model, demonstrating its ability to recover continuous distributions over structural states. We analyze the connection between our formulation and Maximum A Posteriori (MAP) approaches, which can be interpreted as instances of the discretize-then-optimize (DTO) framework. We further provide a consistency analysis, establishing conditions under which DTO methods, such as MAP estimation, converge to the solution of the underlying infinite-dimensional continuous problem. Beyond cryo-EM, the framework provides a general methodology for solving SIPs involving random forward operators.