CLIMD: A Curriculum Learning Framework for Imbalanced Multimodal Diagnosis

📅 2025-08-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
In clinical multimodal diagnosis, class imbalance severely impairs model performance on minority diseases; conventional resampling or loss-reweighting approaches are prone to overfitting or underfitting and neglect cross-modal interactions. To address this, we propose CLIMD, a curriculum learning framework built on a multimodal curriculum measurer that combines intra-modal confidence and inter-modal complementarity, so that the model focuses on key samples and gradually adapts to complex class distributions. CLIMD also introduces a class distribution-guided training scheduler that lets the model progressively adapt to the imbalanced class distribution during training. Evaluated across multiple multimodal medical datasets, CLIMD consistently improves minority-class diagnostic accuracy, achieving average F1-score gains of 3.2–7.8%. As a plug-and-play framework, it can be integrated into existing models.

📝 Abstract
Clinicians usually combine information from multiple sources to achieve the most accurate diagnosis, and this has sparked increasing interest in leveraging multimodal deep learning for diagnosis. However, in real clinical scenarios, due to differences in incidence rates, multimodal medical data commonly face the issue of class imbalance, which makes it difficult to adequately learn the features of minority classes. Most existing methods tackle this issue with resampling or loss reweighting, but they are prone to overfitting or underfitting and fail to capture cross-modal interactions. Therefore, we propose a Curriculum Learning framework for Imbalanced Multimodal Diagnosis (CLIMD). Specifically, we first design a multimodal curriculum measurer that combines two indicators, intra-modal confidence and inter-modal complementarity, to enable the model to focus on key samples and gradually adapt to complex category distributions. Additionally, a class distribution-guided training scheduler is introduced, which enables the model to progressively adapt to the imbalanced class distribution during training. Extensive experiments on multiple multimodal medical datasets demonstrate that the proposed method outperforms state-of-the-art approaches across various metrics and excels in handling imbalanced multimodal medical data. Furthermore, as a plug-and-play CL framework, CLIMD can be easily integrated into other models, offering a promising path for improving multimodal disease diagnosis accuracy. Code is publicly available at https://github.com/KHan-UJS/CLIMD.
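The abstract names two indicators for the curriculum measurer but does not give their formulas. The sketch below is only one plausible reading, not the authors' implementation: it assumes each modality produces a softmax distribution per sample, takes intra-modal confidence as the mean probability assigned to the true class, and approximates inter-modal complementarity by the mean pairwise total-variation distance between modality predictions. The function names, the `alpha` weight, and the way the two indicators are combined are all hypothetical.

```python
from itertools import combinations


def intra_modal_confidence(probs_per_modality, label):
    """Mean probability that each modality assigns to the true class."""
    return sum(p[label] for p in probs_per_modality) / len(probs_per_modality)


def inter_modal_complementarity(probs_per_modality):
    """Mean pairwise total-variation distance between modality predictions.

    Assumption: disagreement between modalities stands in for complementarity.
    """
    pairs = list(combinations(probs_per_modality, 2))
    if not pairs:
        return 0.0
    distances = [0.5 * sum(abs(a - b) for a, b in zip(p, q)) for p, q in pairs]
    return sum(distances) / len(distances)


def difficulty(probs_per_modality, label, alpha=0.5):
    """Hypothetical difficulty score: low confidence and high disagreement
    both make a sample harder. `alpha` balances the two indicators."""
    conf = intra_modal_confidence(probs_per_modality, label)
    comp = inter_modal_complementarity(probs_per_modality)
    return alpha * (1.0 - conf) + (1.0 - alpha) * comp


# Two toy samples, each with predictions from two modalities over two classes.
easy = [[0.9, 0.1], [0.8, 0.2]]  # both modalities favor class 0
hard = [[0.3, 0.7], [0.6, 0.4]]  # modalities disagree about class 0

# Sort (sample, label) pairs from easiest to hardest.
samples = [(hard, 0), (easy, 0)]
ranked = sorted(samples, key=lambda s: difficulty(s[0], s[1]))
```

A curriculum would then present samples in `ranked` order, or use the score to decide when a sample enters training.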
Problem

Research questions and friction points this paper is trying to address.

Address class imbalance in multimodal medical data
Improve feature learning for minority classes
Enhance cross-modal interaction in diagnosis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal curriculum measurer for key samples
Class distribution-guided training scheduler
Plug-and-play framework for easy integration
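To make the idea of a class distribution-guided scheduler concrete, here is a minimal sketch under stated assumptions. It is not the scheduler from the paper: it assumes the target class distribution moves linearly from uniform (balanced) at the first epoch to the empirical, imbalanced distribution at the last epoch. The direction, the linear schedule, and all function names are hypothetical; swapping the two endpoints gives the opposite direction.

```python
def empirical_distribution(class_counts):
    """Class frequencies observed in the training set."""
    total = sum(class_counts)
    return [c / total for c in class_counts]


def scheduled_distribution(class_counts, epoch, total_epochs):
    """Interpolate linearly from a uniform to the empirical class distribution.

    Assumption: epoch 0 is fully balanced, the last epoch is fully empirical.
    """
    n = len(class_counts)
    uniform = [1.0 / n] * n
    empirical = empirical_distribution(class_counts)
    t = min(max(epoch / max(total_epochs - 1, 1), 0.0), 1.0)
    return [(1.0 - t) * u + t * e for u, e in zip(uniform, empirical)]


def sample_weights(labels, class_counts, epoch, total_epochs):
    """Per-sample weights such that class k is drawn with its scheduled
    probability when sampling proportionally to these weights."""
    dist = scheduled_distribution(class_counts, epoch, total_epochs)
    return [dist[y] / class_counts[y] for y in labels]


# Toy dataset: 90 samples of class 0 and 10 samples of class 1.
counts = [90, 10]
labels = [0] * 90 + [1] * 10

start = scheduled_distribution(counts, epoch=0, total_epochs=10)
end = scheduled_distribution(counts, epoch=9, total_epochs=10)
weights = sample_weights(labels, counts, epoch=0, total_epochs=10)
```

The weights can be passed to a weighted sampler, for example `random.choices(range(len(labels)), weights=weights, k=batch_size)` from the standard library, to draw each batch according to the scheduled distribution.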
🔎 Similar Papers
2024-01-02, IEEE International Conference on Bioinformatics and Biomedicine, Citations: 0
Kai Han
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Chongwen Lyu
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Lele Ma
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Chengxuan Qian
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Siqi Ma
The University of Wollongong
Cybersecurity, Software Engineering, AI Security
Zheng Pang
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Jun Chen
School of Computer Science and Telecommunication Engineering, Jiangsu University, China
Zhe Liu
School of Computer Science and Telecommunication Engineering, Jiangsu University, China