🤖 AI Summary
Cancer progression involves multi-scale biological mechanisms that cannot be fully captured by histopathological images alone. To address this, the authors propose EXAONE Path 2.5, a multimodal pathology foundation model that integrates whole-slide imaging (WSI), genomics, epigenomics, and transcriptomics (RNA-seq) to construct unified patient-level biological representations. Methodologically, it introduces a multimodal SigLIP contrastive loss applied across all modality pairs, a fragment-aware rotary positional encoding (F-RoPE), and domain-specialized internal foundation models for WSI and RNA-seq, which together enable biology-informed cross-modal alignment. Compared against six leading pathology foundation models, the model performs on par with the state of the art on Patho-Bench, a benchmark covering 80 downstream tasks, and shows the highest adaptability on an internal real-world clinical dataset, while remaining data- and parameter-efficient. The work presents a biologically grounded framework for integrative computational pathology.
📝 Abstract
Cancer progression arises from interactions across multiple biological layers, including molecular layers beyond morphology that remain invisible to image-only models. To capture this broader biological landscape, we present EXAONE Path 2.5, a pathology foundation model that jointly models histologic, genomic, epigenetic, and transcriptomic modalities, producing an integrated patient representation that reflects tumor biology more comprehensively. Our approach incorporates three key components: (1) a multimodal SigLIP loss that enables all-pairwise contrastive learning across heterogeneous modalities, (2) a fragment-aware rotary positional encoding (F-RoPE) module that preserves spatial structure and tissue-fragment topology in whole-slide images (WSIs), and (3) domain-specialized internal foundation models for both WSI and RNA-seq that provide biologically grounded embeddings for robust multimodal alignment. We evaluate EXAONE Path 2.5 against six leading pathology foundation models on two complementary benchmarks: an internal real-world clinical dataset and Patho-Bench, which covers 80 tasks. Our framework demonstrates high data and parameter efficiency, performing on par with state-of-the-art foundation models on Patho-Bench while exhibiting the highest adaptability in the internal clinical setting. These results highlight the value of biologically informed multimodal design and underscore the potential of integrated genotype-to-phenotype modeling for next-generation precision oncology.
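To make component (1) more concrete, here is a minimal sketch of an all-pairwise sigmoid contrastive loss. It is not the authors' implementation: the abstract does not give the formula, so the sketch assumes the standard SigLIP pairwise sigmoid loss applied to every pair of modalities and averaged. The function names, the modality keys (`wsi`, `rna`, `dna`), and the `temperature`/`bias` values are illustrative assumptions.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def siglip_pair_loss(a, b, temperature=10.0, bias=-10.0):
    """Sigmoid contrastive loss between two modalities.

    a[i] and b[i] are embeddings of the same patient (label +1);
    a[i] and b[j] with i != j are treated as negatives (label -1).
    temperature and bias are learnable in practice; fixed here (assumption).
    """
    n = len(a)
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    total = 0.0
    for i in range(n):
        for j in range(n):
            sim = sum(p * q for p, q in zip(a[i], b[j]))
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * (temperature * sim + bias))
    return -total / n


def multimodal_siglip_loss(embeddings):
    """Average the pairwise loss over every unordered pair of modalities.

    embeddings maps a modality name to a list of per-patient vectors,
    all lists in the same patient order (hypothetical data layout).
    """
    pairs = list(combinations(sorted(embeddings), 2))
    losses = [siglip_pair_loss(embeddings[m], embeddings[k]) for m, k in pairs]
    return sum(losses) / len(pairs)


# Toy batch of two patients with three modalities (made-up vectors).
aligned = {
    "wsi": [[1.0, 0.0], [0.0, 1.0]],
    "rna": [[0.9, 0.1], [0.1, 0.9]],
    "dna": [[1.0, 0.2], [0.2, 1.0]],
}
# Same batch, but the RNA rows are swapped so patients no longer match.
misaligned = dict(aligned, rna=[[0.1, 0.9], [0.9, 0.1]])

loss_aligned = multimodal_siglip_loss(aligned)
loss_misaligned = multimodal_siglip_loss(misaligned)
```

In this toy batch, the loss for the matched patients is lower than for the batch with swapped RNA rows, which is the behavior a contrastive objective of this kind is meant to reward. Because each pair is scored independently through a sigmoid, the formulation needs no batch-wide softmax normalization and extends to additional modalities by adding more pairs. How the actual model handles patients with missing modalities is not stated in the abstract and is not covered here.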