LeMoF: Level-guided Multimodal Fusion for Heterogeneous Clinical Data

📅 2026-01-15

🤖 AI Summary
Existing multimodal clinical prediction approaches rely on static fusion strategies that struggle to leverage modality-specific representations from heterogeneous data such as electronic health records and biosignals. To address this limitation, this work proposes LeMoF, a framework that uses a hierarchical (level-based) guidance mechanism to dynamically select and fuse multi-granularity encoder representations within each modality, while learning both global modality-level predictions and level-specific discriminative features. By moving beyond static fusion and integrating multi-task learning, LeMoF improves model robustness and discriminative capacity in heterogeneous clinical settings. Experiments on ICU length-of-stay prediction show that LeMoF consistently outperforms state-of-the-art methods across diverse encoder configurations, which points to hierarchical fusion as a key contributor to clinical prediction performance.

📝 Abstract
Multimodal clinical prediction is widely used to integrate heterogeneous data such as Electronic Health Records (EHR) and biosignals. However, existing methods tend to rely on static modality integration schemes and simple fusion strategies. As a result, they fail to fully exploit modality-specific representations. In this paper, we propose Level-guided Modal Fusion (LeMoF), a novel framework that selectively integrates level-guided representations within each modality. Each level refers to a representation extracted from a different layer of the encoder. LeMoF explicitly separates the learning of global modality-level predictions from the learning of level-specific discriminative representations. This design enables LeMoF to balance prediction stability and discriminative capability even in heterogeneous clinical environments. Experiments on length-of-stay prediction using Intensive Care Unit (ICU) data demonstrate that LeMoF consistently outperforms existing state-of-the-art multimodal fusion techniques across various encoder configurations. We also confirm that level-wise integration is a key factor in achieving robust predictive performance across various clinical conditions.
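To make the idea of level-wise integration concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how levels are selected, so the softmax gate, the `guide` scoring vectors, the function names, and the toy numbers are all assumptions made for illustration. Each modality supplies one vector per encoder layer ("level"), the gate weights those levels, and the fused per-modality vectors are concatenated.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_reprs, guide):
    """Fuse one modality's level representations with softmax weights.

    level_reprs: list of equal-length vectors, one per encoder layer ("level").
    guide: hypothetical scoring vector (it would be learned in practice).
    Returns (fused_vector, level_weights).
    """
    # Score each level by its dot product with the guide vector.
    scores = [sum(h * g for h, g in zip(level, guide)) for level in level_reprs]
    weights = softmax(scores)
    dim = len(level_reprs[0])
    # Weighted sum of the level vectors, one feature dimension at a time.
    fused = [sum(w * level[i] for w, level in zip(weights, level_reprs))
             for i in range(dim)]
    return fused, weights


def fuse_modalities(modalities, guides):
    """Concatenate the level-fused vector of every modality."""
    joint = []
    for name, level_reprs in modalities.items():
        fused, _ = fuse_levels(level_reprs, guides[name])
        joint.extend(fused)
    return joint


# Toy inputs (assumed): two modalities, three encoder levels each, 2-d features.
modalities = {
    "ehr": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "biosignal": [[0.5, 0.5], [2.0, 0.0], [0.0, 2.0]],
}
guides = {"ehr": [1.0, 1.0], "biosignal": [1.0, 0.0]}

joint = fuse_modalities(modalities, guides)
print(len(joint))  # 4: two modalities x 2 features
```

A soft weighting is used here because it keeps every level in play while still letting one dominate. A hard top-k selection would be an equally plausible reading of "selectively integrates".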
Problem

Research questions and friction points this paper is trying to address.

multimodal fusion
heterogeneous clinical data
modality-specific representations
clinical prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

level-guided fusion
multimodal learning
heterogeneous clinical data
modality-specific representation
encoder-level integration
Jongseok Kim
Department of Computer Science, Chungbuk National University, South Korea
Seongae Kang
Department of Computer Science, Chungbuk National University, South Korea
Jonghwan Shin
Department of Computer Science, Chungbuk National University, South Korea
Yuhan Lee
Brigham and Women’s Hospital, Harvard Medical School, Boston, United States
Ohyun Jo
Professor, Chungbuk National University