MultiMAE for Brain MRIs: Robustness to Missing Inputs Using Multi-Modal Masked Autoencoder

📅 2025-09-14
🤖 AI Summary
Missing sequences are common in multimodal brain MRI data and degrade model robustness. To address this, the paper proposes a pretraining framework based on a Multimodal Masked Autoencoder (MultiMAE). Each MRI sequence is treated as an independent modality; a 3D Transformer encoder with late fusion captures cross-sequence dependencies, and a multi-decoder reconstruction architecture enables self-supervised learning and cross-modal inference under incomplete inputs. The core contributions are a modality-aware masking mechanism and a decoupled multi-task reconstruction objective, which allow the model to infer missing modalities from available ones. In downstream segmentation and classification tasks with missing input sequences, MultiMAE achieves an absolute improvement of 10.1 in overall Dice score and 0.46 in Matthews Correlation Coefficient (MCC) over the MAE-ViT baseline, indicating improved robustness and flexibility in downstream adaptation.

📝 Abstract
Missing input sequences are common in medical imaging data, posing a challenge for deep learning models that rely on complete input data. In this work, inspired by MultiMAE [2], we develop a masked autoencoder (MAE) paradigm for multi-modal, multi-task learning in 3D medical imaging with brain MRIs. Our method treats each MRI sequence as a separate input modality, using a late-fusion-style transformer encoder to integrate multi-sequence information (multi-modal) and an individual decoder stream per modality for multi-task reconstruction. This pretraining strategy guides the model to learn rich representations per modality while also equipping it to handle missing inputs through cross-sequence reasoning. The result is a flexible and generalizable encoder for brain MRIs that infers missing sequences from available inputs and can be adapted to various downstream applications. We demonstrate the performance and robustness of our method against an MAE-ViT baseline in downstream segmentation and classification tasks, showing an absolute improvement of $10.1$ in overall Dice score and $0.46$ in MCC over the baseline when input sequences are missing. Our experiments demonstrate the strength of this pretraining strategy. The implementation is made available.
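For readers less familiar with the two reported metrics, the sketch below computes the Dice score and the Matthews Correlation Coefficient (MCC) from confusion-matrix counts using their standard definitions. The function names, the example counts, and the handling of empty denominators are illustrative assumptions, not taken from the paper. Since Dice is bounded by 1 on the unit scale, the reported 10.1 is presumably expressed in percentage points; that reading is an inference rather than something the abstract states.

```python
import math


def dice_score(tp, fp, fn):
    """Dice = 2*TP / (2*TP + FP + FN), on the unit scale [0, 1]."""
    denom = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)), in [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_dice = dice_score(tp=8, fp=2, fn=2)   # 16 / 20
example_mcc = mcc(tp=8, tn=8, fp=2, fn=2)     # 60 / 100
```

Dice ignores true negatives, which suits segmentation where background dominates, while MCC uses all four counts and is therefore a common choice for classification with imbalanced classes.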
Problem

Research questions and friction points this paper is trying to address.

Handling missing MRI sequences in medical imaging
Learning robust multi-modal representations for brain MRIs
Enabling cross-sequence reasoning to infer missing inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked autoencoder for multi-modal MRI learning
Late-fusion transformer encoder integrates information across MRI sequences
Individual per-modality decoders enable multi-task reconstruction
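The ideas listed above can be illustrated with a minimal sketch of token selection for multi-modal masked pretraining. It assumes each sequence is split into a fixed number of 3D patch tokens, that a shared encoder sees only a small budget of visible tokens pooled from the sequences that are present, and that each sequence has its own decoder that reconstructs the tokens the encoder did not see. The sequence names (`T1`, `T1c`, `T2`, `FLAIR`), the token counts, the uniform sampling scheme, and both function names are hypothetical; the source does not specify these details.

```python
import random


def sample_visible_tokens(tokens_per_sequence, available, num_visible, seed=0):
    """Choose which patch tokens the shared encoder sees.

    tokens_per_sequence: dict mapping sequence name -> number of patch tokens
    available: names of the sequences actually present for this scan
    num_visible: total visible-token budget across all sequences (assumed)
    """
    rng = random.Random(seed)
    # Pool candidate tokens only from sequences that are present, so a
    # missing sequence contributes no tokens to the encoder input.
    pool = [
        (name, idx)
        for name, count in tokens_per_sequence.items()
        if name in available
        for idx in range(count)
    ]
    # Uniform sampling over the pool is a simplification for this sketch.
    chosen = rng.sample(pool, min(num_visible, len(pool)))

    visible = {name: [] for name in tokens_per_sequence}
    for name, idx in chosen:
        visible[name].append(idx)
    return {name: sorted(ids) for name, ids in visible.items()}


def reconstruction_targets(tokens_per_sequence, visible):
    """Per-sequence decoder targets: every token the encoder did not see."""
    targets = {}
    for name, count in tokens_per_sequence.items():
        seen = set(visible[name])
        targets[name] = [i for i in range(count) if i not in seen]
    return targets


# Hypothetical scan with four sequences of 8 tokens each; T1c and T2 are
# treated as missing, so their decoders must reconstruct every token.
tokens = {"T1": 8, "T1c": 8, "T2": 8, "FLAIR": 8}
visible = sample_visible_tokens(tokens, available={"T1", "FLAIR"}, num_visible=6)
targets = reconstruction_targets(tokens, visible)
```

In this setup a missing sequence is simply one with zero visible tokens, so the same code path covers both random masking during pretraining and absent inputs at inference time. During pretraining the dropped sequence still has ground truth available for the reconstruction loss, which is what would push the encoder toward cross-sequence reasoning.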
Ayhan Can Erdur
Technical University of Munich
Deep Learning, Computer Vision, Medical Imaging, 3D Segmentation, Survival Analysis
Christian Beischl
Chair for AI in Healthcare and Medicine, Technical University of Munich (TUM) and TUM University Hospital, Munich, Germany
Daniel Scholz
Chair for AI in Healthcare and Medicine, Technical University of Munich (TUM) and TUM University Hospital, Munich, Germany; Munich Center for Machine Learning (MCML), Munich, Germany; Department of Computing, Imperial College London, London, UK
Jiazhen Pan
Technical University of Munich
Machine Learning, Medical Image Computing, Biomedical Image Analysis
Benedikt Wiestler
Chair for AI for Image-Guided Diagnosis and Therapy, Technical University of Munich (TUM) and TUM University Hospital, Munich, Germany; Munich Center for Machine Learning (MCML), Munich, Germany
Daniel Rueckert
Technical University of Munich and Imperial College London
Machine Learning, Medical Image Computing, Biomedical Image Analysis, Computer Vision
Jan C. Peeken
Department of Radiation Oncology, TUM University Hospital, Munich, Germany; Deutsches Konsortium für Translationale Krebsforschung (DKTK), Partner Site Munich, Munich, Germany; Institute of Radiation Medicine (IRM), Department of Radiation Sciences (DRS), Helmholtz Center Munich, Munich, Germany