🤖 AI Summary
To address challenges in chronic liver disease prognosis prediction—including high redundancy among multimodal data (CT images, clinical biomarkers, and radiomic features), weak cross-modal interactions, and inconsistent feature representations—this paper proposes a novel trimodal fusion framework. Methodologically, it introduces an intra-modal aggregation module to enhance unimodal representation learning and a trimodal cross-attention fusion module to enable fine-grained, bidirectional cross-modal interaction. Additionally, a trimodal feature alignment loss is designed to enforce consistency across modalities and promote joint representation learning. Extensive experiments on a dedicated liver disease prognosis dataset demonstrate that the proposed method significantly outperforms unimodal baselines and state-of-the-art multimodal approaches, achieving an absolute AUC improvement of over 8%. The implementation code is publicly available.
📝 Abstract
Chronic liver disease represents a significant health challenge worldwide, and accurate prognostic evaluation is essential for personalized treatment planning. Recent evidence suggests that integrating multimodal data, such as computed tomography imaging, radiomic features, and clinical information, can provide more comprehensive prognostic information. However, these modalities are inherently heterogeneous, and incorporating additional modalities may exacerbate the challenges of heterogeneous data fusion. Moreover, existing multimodal fusion methods often struggle to adapt to richer medical modalities, making it difficult to capture inter-modal relationships. To overcome these limitations, we present the Triple-Modal Interaction Chronic Liver Network (TMI-CLNet). Specifically, we develop an Intra-Modality Aggregation module and a Triple-Modal Cross-Attention Fusion module, designed to eliminate intra-modality redundancy and extract cross-modal information, respectively. Furthermore, we design a Triple-Modal Feature Fusion loss function to align feature representations across modalities. Extensive experiments on a liver prognosis dataset demonstrate that our approach significantly outperforms existing state-of-the-art unimodal models and other multimodal techniques. Our code is available at https://github.com/Mysterwll/liver.git.
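The paper's exact module designs are not specified beyond the abstract, but the two core ideas — pairwise cross-attention among three modalities and a loss that pulls modality embeddings into agreement — can be illustrated with a minimal NumPy sketch. All names, dimensions, and the specific loss form (mean cosine distance over modality pairs) are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats, d):
    # tokens of one modality (queries) attend over tokens of another
    scores = q_feats @ kv_feats.T / np.sqrt(d)       # (n_q, n_kv)
    return softmax(scores, axis=-1) @ kv_feats       # (n_q, d)

# hypothetical token features for the three modalities
rng = np.random.default_rng(0)
d = 16
img = rng.normal(size=(4, d))   # CT image tokens
rad = rng.normal(size=(6, d))   # radiomic feature tokens
cli = rng.normal(size=(5, d))   # clinical biomarker tokens

# bidirectional interaction: each modality attends to the other two,
# with a residual connection back to its own representation
img_fused = img + cross_attention(img, rad, d) + cross_attention(img, cli, d)
rad_fused = rad + cross_attention(rad, img, d) + cross_attention(rad, cli, d)
cli_fused = cli + cross_attention(cli, img, d) + cross_attention(cli, rad, d)

# sketch of an alignment-style loss: encourage the pooled embeddings of
# the three modalities to agree (mean cosine distance over the pairs)
def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

p_img, p_rad, p_cli = img_fused.mean(0), rad_fused.mean(0), cli_fused.mean(0)
align_loss = np.mean([1 - cos(p_img, p_rad),
                      1 - cos(p_img, p_cli),
                      1 - cos(p_rad, p_cli)])
```

In a trained network these pieces would operate on learned projections (separate query/key/value weights per modality pair) rather than raw features, and the fused token sets would be pooled into a single prognostic head.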