MCE: Towards a General Framework for Handling Missing Modalities under Imbalanced Missing Rates

📅 2025-10-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
In multimodal learning, imbalanced modality missing rates cause inconsistent learning progress and representation degradation, creating a vicious cycle of performance deterioration. Existing methods primarily address dataset-level balancing while neglecting sample-level dynamic variations in modality utility and the intrinsic decline in feature quality. To tackle this, the authors propose MCE, a general-purpose framework with two novel modules: (1) Learning Capability Enhancement, which dynamically adjusts per-sample learning progress across modalities via multi-level factors; and (2) Representation Capability Enhancement, which improves the semantic richness and robustness of features through subset prediction and cross-modal completion tasks. Integrated with multimodal fusion and contrastive learning, MCE breaks the performance-degradation cycle. Extensive experiments show that MCE consistently outperforms state-of-the-art methods on four benchmarks under diverse missing-rate configurations. The code is publicly available.
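The first module's core idea, re-weighting updates so that modalities lagging in learning progress (typically those with higher missing rates) are not starved of gradient, can be sketched in a few lines. This is a minimal illustration, not the paper's actual formulation: the function name, the use of a running progress score per modality, and the inverse-to-mean weighting rule are all assumptions made here for clarity.

```python
def balance_modality_losses(losses, progress, eps=1e-8):
    """Scale each modality's loss inversely to its relative learning progress.

    losses:   dict mapping modality name -> scalar loss value
    progress: dict mapping modality name -> running progress score
              (e.g. an EMA of per-modality validation accuracy);
              higher means further along in training.

    A modality lagging behind the mean progress gets a weight > 1,
    so it receives proportionally larger updates.
    """
    mean_p = sum(progress.values()) / len(progress)
    return {m: (mean_p / (progress[m] + eps)) * l for m, l in losses.items()}


# Example: "video" is more often missing and has fallen behind.
weighted = balance_modality_losses(
    losses={"audio": 1.0, "video": 1.0},
    progress={"audio": 0.8, "video": 0.4},
)
# mean progress = 0.6, so audio is down-weighted and video up-weighted
```

In the paper this balancing is described as operating at multiple levels (per sample as well as per modality); the sketch above shows only the modality-level case.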

📝 Abstract
Multi-modal learning has made significant advances across diverse pattern recognition applications. However, handling missing modalities, especially under imbalanced missing rates, remains a major challenge. This imbalance triggers a vicious cycle: modalities with higher missing rates receive fewer updates, leading to inconsistent learning progress and representational degradation that further diminishes their contribution. Existing methods typically focus on global dataset-level balancing, often overlooking critical sample-level variations in modality utility and the underlying issue of degraded feature quality. We propose Modality Capability Enhancement (MCE) to tackle these limitations. MCE includes two synergistic components: i) Learning Capability Enhancement (LCE), which introduces multi-level factors to dynamically balance modality-specific learning progress, and ii) Representation Capability Enhancement (RCE), which improves feature semantics and robustness through subset prediction and cross-modal completion tasks. Comprehensive evaluations on four multi-modal benchmarks show that MCE consistently outperforms state-of-the-art methods under various missing configurations. The journal preprint version is now available at https://doi.org/10.1016/j.patcog.2025.112591. Our code is available at https://github.com/byzhaoAI/MCE.
Problem

Research questions and friction points this paper is trying to address.

Handling missing modalities in multi-modal learning
Addressing imbalanced missing rates across modalities
Improving degraded feature quality from missing data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic balancing of modality-specific learning progress
Enhancing feature semantics through subset prediction tasks
Improving robustness via cross-modal completion techniques
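The cross-modal completion idea, predicting a missing modality's features from a present one so the model still sees a full feature set, can be illustrated with a toy linear stand-in. The paper's actual completion network is not specified here; the linear least-squares map, the feature dimensions, and the synthetic paired data below are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired features: 100 samples where both modalities are present.
feat_text = rng.normal(size=(100, 8))
true_map = rng.normal(size=(8, 8))
feat_image = feat_text @ true_map + 0.01 * rng.normal(size=(100, 8))

# Learn a completion map (text features -> image features) from the
# fully observed pairs via least squares.
completion_map, *_ = np.linalg.lstsq(feat_text, feat_image, rcond=None)

# At inference, complete a sample whose image modality is "missing".
completed = feat_text[:1] @ completion_map
reconstruction_error = float(np.linalg.norm(completed - feat_image[:1]))
```

A small reconstruction error on this toy data shows the mechanism: once a cross-modal map is learned from complete samples, incomplete samples can be filled in rather than dropped or zero-padded.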
Binyu Zhao
The School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Wei Zhang
The School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Zhaonian Zou
Harbin Institute of Technology, China
Databases · Data Mining