🤖 AI Summary
To address the clinical impracticality of quantitative magnetization transfer (qMT) myelin imaging, whose specialized acquisitions take 20–30 minutes, this work proposes a method for synthesizing quantitative myelin biomarkers, specifically pool size ratio (PSR) maps, from routine T1-weighted (T1w) and FLAIR MRI scans. We introduce a 3D latent diffusion model with a decoupled multi-stream conditioning mechanism that integrates semantic cross-attention, scale-wise 3D ControlNet residual guidance, and LoRA-modulated attention, yielding substantial improvements in boundary sharpness and quantitative consistency with few trainable parameters. A two-stage training strategy is employed: first aligning latent spaces, then freezing the diffusion backbone while jointly optimizing edge-aware and alignment losses. Evaluated on 163 clinically acquired scans via five-fold cross-validation, our method consistently outperforms VAE-, GAN-, and standard diffusion-based baselines, yielding PSR maps with superior anatomical fidelity and quantitative accuracy. Code is publicly available.
📝 Abstract
Quantitative magnetization transfer (qMT) imaging provides myelin-sensitive biomarkers, such as the pool size ratio (PSR), which are valuable for multiple sclerosis (MS) assessment. However, qMT requires specialized 20–30 minute scans. We propose DEMIST to synthesize PSR maps from standard T1w and FLAIR images using a 3D latent diffusion model with three complementary conditioning mechanisms. Our approach has two stages: first, we train separate autoencoders for PSR and anatomical images to learn aligned latent representations. Second, we train a conditional diffusion model in this latent space on top of a frozen diffusion foundation backbone. Conditioning is decoupled into: (i) **semantic** tokens via cross-attention, (ii) **spatial** per-scale residual hints via a 3D ControlNet branch, and (iii) **adaptive** LoRA-modulated attention. We include edge-aware loss terms to preserve lesion boundaries and alignment losses to maintain quantitative consistency, while keeping the number of trainable parameters low and retaining the inductive bias of the pretrained model. We evaluate on 163 scans from 99 subjects using 5-fold cross-validation. Our method outperforms VAE, GAN and diffusion baselines on multiple metrics, producing sharper boundaries and better quantitative agreement with ground truth. Our code is publicly available at https://github.com/MedICL-VU/MS-Synthesis-3DcLDM.
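The abstract's LoRA-modulated attention keeps trainable parameters low by adding a low-rank residual to frozen backbone weights rather than fine-tuning them. A minimal NumPy sketch of that idea follows; the function name, dimensions, and scaling are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lora_adapt(W_frozen, A, B, alpha=8.0):
    """Low-rank adaptation: the frozen projection W_frozen gets a
    trainable residual (alpha / r) * B @ A, where r is the adapter rank."""
    r = A.shape[0]
    return W_frozen + (alpha / r) * (B @ A)

# Hypothetical dimensions for one attention projection.
d_out, d_in, r = 64, 64, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen backbone weight
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-init

# With the customary zero-initialized B, the adapter is a no-op at the
# start of training, so the pretrained model's inductive bias is preserved.
assert np.allclose(lora_adapt(W, A, B), W)

# Trainable parameters per layer: r*(d_in + d_out) vs d_in*d_out for full tuning.
print(r * (d_in + d_out), "vs", d_in * d_out)  # → 512 vs 4096
```

This is why freezing the diffusion backbone while training only the adapters (plus the ControlNet branch) can still steer attention toward the conditioning inputs at a small fraction of the parameter cost.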