SHMoAReg: Spark Deformable Image Registration via Spatial Heterogeneous Mixture of Experts and Attention Heads

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current deformable image registration methods suffer from two key limitations: (1) non-specific feature extraction and (2) homogeneous deformation field prediction across the three spatial dimensions (x, y, z). To address these, this work introduces, for the first time, a Mixture-of-Experts (MoE) mechanism into deformable registration. We propose a dynamically specialized network: at the encoder, a Mixture of Attention Heads enables adaptive multi-scale feature selection; at the decoder, a Spatially Heterogeneous MoE employs expert branches with distinct receptive fields to independently predict deformation fields along x, y, and z axes. Evaluated on public abdominal CT datasets, our method improves the Dice score from 60.58% to 65.58%, significantly enhancing registration accuracy. Moreover, it improves model interpretability by revealing the collaborative, scale- and direction-specific specialization of experts.
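To make the direction-wise prediction idea concrete, here is a minimal, library-free sketch of how a Spatially Heterogeneous MoE could combine expert outputs per voxel. This is our own illustration under stated assumptions (softmax gating, one gate per spatial axis), not the authors' released code; in the paper each expert is a convolutional branch with a different kernel size.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def shmoe_voxel(expert_outputs, gate_logits):
    """Combine per-expert displacement predictions for one voxel.

    expert_outputs: list of (dx, dy, dz) tuples, one per expert
      (in the paper, each expert uses a different receptive field).
    gate_logits: per-direction gating logits, shape [3][n_experts],
      so the x, y, and z components are weighted heterogeneously
      rather than by a single shared gate.
    """
    field = []
    for axis in range(3):  # x, y, z handled independently
        w = softmax(gate_logits[axis])
        field.append(sum(wi * out[axis] for wi, out in zip(w, expert_outputs)))
    return tuple(field)
```

With gating logits that strongly favor a different expert per axis, each displacement component is dominated by its own expert, which is the heterogeneity the paper describes.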

📝 Abstract
Encoder-Decoder architectures are widely used in deep learning-based Deformable Image Registration (DIR), where the encoder extracts multi-scale features and the decoder predicts deformation fields by recovering spatial locations. However, current methods lack specialized extraction of features useful for registration and predict deformation jointly and homogeneously in all three directions. In this paper, we propose a novel expert-guided DIR network, named SHMoAReg, with a Mixture of Experts (MoE) mechanism applied in both the encoder and decoder. Specifically, we incorporate Mixture of Attention heads (MoA) into the encoder layers and Spatial Heterogeneous Mixture of Experts (SHMoE) into the decoder layers. The MoA enhances the specialization of feature extraction by dynamically selecting the optimal combination of attention heads for each image token. Meanwhile, the SHMoE predicts deformation fields heterogeneously in three directions for each voxel, using experts with varying kernel sizes. Extensive experiments on two publicly available datasets show consistent improvements over various methods, with a notable increase from 60.58% to 65.58% in Dice score on the abdominal CT dataset. Furthermore, SHMoAReg enhances model interpretability by differentiating the experts' utilities across and within different resolution layers. To the best of our knowledge, we are the first to introduce the MoE mechanism into DIR tasks. The code will be released soon.
Problem

Research questions and friction points this paper is trying to address.

Improving feature extraction specialization for deformable image registration
Enabling heterogeneous deformation prediction in three spatial directions
Enhancing model interpretability through mixture of experts mechanism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Attention heads for dynamic feature extraction
Spatial Heterogeneous MoE for directional deformation prediction
First application of the Mixture of Experts mechanism to deformable image registration
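The encoder-side innovation, Mixture of Attention heads, routes each image token to a token-specific subset of attention heads. The following pure-Python sketch illustrates the routing step only (top-k selection and renormalized softmax weights); the function name, `k`, and the toy vectors are our own illustrative assumptions, not the paper's implementation.

```python
import math

def top_k_head_mix(head_outputs, router_logits, k=2):
    """Sketch of per-token Mixture-of-Attention-heads routing.

    head_outputs: list of per-head output vectors (lists of floats)
      for one image token.
    router_logits: one routing score per head for this token.
    Keeps only the top-k scoring heads and renormalizes their softmax
    weights, so each token uses its own combination of heads.
    """
    # Indices of the k highest-scoring heads for this token.
    idx = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax restricted to the selected heads (numerically stable).
    m = max(router_logits[i] for i in idx)
    es = {i: math.exp(router_logits[i] - m) for i in idx}
    z = sum(es.values())
    weights = {i: e / z for i, e in es.items()}
    # Weighted mix of the selected heads' outputs.
    dim = len(head_outputs[0])
    return [sum(weights[i] * head_outputs[i][d] for i in idx)
            for d in range(dim)]
```

For example, with `k=1` the token's output collapses to the single highest-scoring head, which is the sparsest form of the adaptive head selection described above.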
Yuxi Zheng
Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China
Jianhui Feng
Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China
Tianran Li
Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai, China
Marius Staring
Leiden University Medical Center
Bio-medical image analysis, machine learning, image registration
Yuchuan Qiao
Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University
Image registration, proton therapy, multi-modal image analysis