🤖 AI Summary
Medical multimodal data often suffer from pervasive missingness, and conventional imputation methods introduce bias and impair model robustness. To address this, we propose the first end-to-end multimodal Transformer framework that operates without data imputation. Our method features: (1) a middle-fusion architecture with learnable modality-specific positional encodings; (2) a missingness-aware adaptive masked self-attention mechanism that directly models only the available subset of modalities; and (3) a multi-task joint optimization strategy. Evaluated on eight clinical diagnosis and prognosis tasks, our approach consistently outperforms ten state-of-the-art baselines. Under 40%–80% random missingness, it retains over 92% of baseline performance while reducing relative error by 37%. The framework demonstrates significantly improved generalization to arbitrary missing patterns and enhanced robustness—without relying on imputation or auxiliary reconstruction objectives.
📝 Abstract
In healthcare, the integration of multimodal data is pivotal for developing comprehensive diagnostic and predictive models. However, managing missing data remains a significant challenge in real-world applications. We introduce MARIA (Multimodal Attention Resilient to Incomplete datA), a novel transformer-based deep learning model designed to address these challenges through an intermediate fusion strategy. Unlike conventional approaches that depend on imputation, MARIA utilizes a masked self-attention mechanism, which processes only the available data without generating synthetic values. This approach enables it to effectively handle incomplete datasets, enhancing robustness and minimizing biases introduced by imputation methods. We evaluated MARIA against 10 state-of-the-art machine learning and deep learning models across 8 diagnostic and prognostic tasks. The results demonstrate that MARIA outperforms existing methods in terms of performance and resilience to varying levels of data incompleteness, underscoring its potential for critical healthcare applications.