MARIA: a Multimodal Transformer Model for Incomplete Healthcare Data

📅 2024-12-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Medical multimodal data often suffer from pervasive missingness, and conventional imputation methods introduce bias and impair model robustness. To address this, we propose the first end-to-end multimodal Transformer framework that operates without data imputation. Our method features: (1) a middle-fusion architecture with learnable modality-specific positional encodings; (2) a missingness-aware adaptive masked self-attention mechanism that directly models only the available subset of modalities; and (3) a multi-task joint optimization strategy. Evaluated on eight clinical diagnosis and prognosis tasks, our approach consistently outperforms ten state-of-the-art baselines. Under 40%–80% random missingness, it retains over 92% of baseline performance while reducing relative error by 37%. The framework demonstrates significantly improved generalization to arbitrary missing patterns and enhanced robustness—without relying on imputation or auxiliary reconstruction objectives.

Technology Category

Application Category

📝 Abstract
In healthcare, the integration of multimodal data is pivotal for developing comprehensive diagnostic and predictive models. However, managing missing data remains a significant challenge in real-world applications. We introduce MARIA (Multimodal Attention Resilient to Incomplete datA), a novel transformer-based deep learning model designed to address these challenges through an intermediate fusion strategy. Unlike conventional approaches that depend on imputation, MARIA utilizes a masked self-attention mechanism, which processes only the available data without generating synthetic values. This approach enables it to effectively handle incomplete datasets, enhancing robustness and minimizing biases introduced by imputation methods. We evaluated MARIA against 10 state-of-the-art machine learning and deep learning models across 8 diagnostic and prognostic tasks. The results demonstrate that MARIA outperforms existing methods in terms of performance and resilience to varying levels of data incompleteness, underscoring its potential for critical healthcare applications.
Problem

Research questions and friction points this paper is trying to address.

Handling missing multimodal healthcare data effectively
Avoiding biases from traditional data imputation methods
Improving diagnostic and predictive model robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal transformer model for incomplete healthcare data
Masked self-attention mechanism avoids data imputation
Intermediate fusion strategy enhances robustness
🔎 Similar Papers
No similar papers found.
Camillo Maria Caruso
Camillo Maria Caruso
PhD student, Università Campus Bio-Medico di Roma
artificial intelligencedeep learningcomputer vision
P
P. Soda
Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Roma, Italy, Europe; Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, Umeå, Sweden, Europe
Valerio Guarrasi
Valerio Guarrasi
Università Campus Bio-Medico di Roma, Italy
Artificial IntelligenceMachine LearningMultimodal Deep LearningGenerative AI