🤖 AI Summary
To address cross-modal, high-resolution synthesis between preoperative MRI and intraoperative ultrasound (US) images, this paper proposes a Hierarchical Mixture-of-Experts Variational Autoencoder (MMHVAE). The method tackles key challenges including multimodal latent distribution modeling, missing-modality estimation, training with incomplete data, and effective information fusion. Technically, it introduces, for the first time, a hierarchical mixture-of-experts architecture to jointly model multimodal latent spaces; it uses explicit variational inference to impute missing modalities and incorporates dataset-level priors to improve robustness; and it combines Product-of-Experts (PoE) fusion with cross-modal latent alignment to obtain a unified representation of multiparametric MRI and US. Evaluated on brain imaging data, MMHVAE achieves significant improvements in PSNR and SSIM, and the synthesized images show accurate anatomical structure and sharp textural detail, supporting real-time intraoperative navigation.
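To make the PoE fusion idea concrete, the sketch below shows a generic precision-weighted product of diagonal-Gaussian experts with a standard-normal prior expert, where missing modalities are simply masked out of the product. This is a minimal illustration of the general technique, not the paper's implementation; the function name, tensor shapes, and masking convention are assumptions.

```python
# Minimal sketch: Product-of-Experts fusion of per-modality Gaussian posteriors.
# Assumed shapes and names are illustrative, not taken from the MMHVAE code base.
import torch

def poe_fusion(mus, logvars, mask):
    """Fuse per-modality Gaussian experts into a single posterior.

    mus, logvars: (num_modalities, batch, latent_dim) per-expert parameters.
    mask:         (num_modalities, batch), 1 where a modality is observed.
    A standard-normal prior expert keeps the product well defined even when
    every modality is missing.
    """
    mask = mask.unsqueeze(-1)                    # broadcast over latent_dim
    precision = torch.exp(-logvars) * mask       # missing experts contribute 0
    fused_precision = 1.0 + precision.sum(dim=0) # prior expert has precision 1, mean 0
    fused_mu = (mus * precision).sum(dim=0) / fused_precision
    fused_logvar = -torch.log(fused_precision)
    return fused_mu, fused_logvar

# Toy usage: 3 modalities, batch of 2, 4-dim latent; sample 2 misses modality 1.
mus = torch.randn(3, 2, 4)
logvars = torch.zeros(3, 2, 4)
mask = torch.tensor([[1., 1.], [1., 0.], [1., 1.]])
mu, logvar = poe_fusion(mus, logvars, mask)
print(mu.shape, logvar.shape)  # torch.Size([2, 4]) torch.Size([2, 4])
```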
📝 Abstract
We propose a deep mixture of multimodal hierarchical variational auto-encoders called MMHVAE that synthesizes missing images from observed images in different modalities. MMHVAE's design focuses on tackling four challenges: (i) creating a complex latent representation of multimodal data to generate high-resolution images; (ii) encouraging the variational distributions to estimate the missing information needed for cross-modal image synthesis; (iii) learning to fuse multimodal information in the context of missing data; (iv) leveraging dataset-level information to handle incomplete data sets at training time. Extensive experiments are performed on the challenging problem of pre-operative brain multi-parametric magnetic resonance and intra-operative ultrasound imaging.
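As background for challenge (i), a hierarchical latent representation stacks latent variables at several scales, with finer latents conditioned on coarser ones. The sketch below is a generic two-level hierarchical VAE in this spirit; the class name, layer sizes, and channel counts are illustrative assumptions and do not reproduce the MMHVAE architecture.

```python
# Minimal sketch of a two-level hierarchical VAE (coarse latent z2, fine latent z1).
import torch
import torch.nn as nn

class TwoLevelVAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder features, then coarse latent z2 and fine latent z1 conditioned on z2.
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU())
        self.to_z2 = nn.Conv2d(32, 2 * 8, 1)       # coarse: mean/log-variance, 8 channels
        self.to_z1 = nn.Conv2d(32 + 8, 2 * 8, 1)   # fine: conditioned on features and z2
        self.dec = nn.Sequential(nn.ConvTranspose2d(16, 16, 4, 2, 1), nn.ReLU(),
                                 nn.ConvTranspose2d(16, 1, 4, 2, 1))

    @staticmethod
    def sample(params):
        mu, logvar = params.chunk(2, dim=1)        # reparameterization trick
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, x):
        h = self.enc(x)
        z2 = self.sample(self.to_z2(h))                               # coarse scale
        z1 = self.sample(self.to_z1(torch.cat([h, z2], dim=1)))       # fine scale
        return self.dec(torch.cat([z2, z1], dim=1))

x = torch.randn(2, 1, 64, 64)
print(TwoLevelVAE()(x).shape)  # torch.Size([2, 1, 64, 64])
```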