Multimodal Contrastive Representation Learning in Augmented Biomedical Knowledge Graphs

📅 2025-01-03
🤖 AI Summary
Problem: Link prediction in biomedical knowledge graphs (BKGs) remains challenging because entity representations are sparse and heterogeneous, and models generalize poorly to unseen nodes. Method: This paper proposes a multimodal framework that combines semantic embeddings from specialized language models with graph contrastive learning to fuse each entity's multimodal features (e.g., biological sequences and textual descriptions), and uses a knowledge graph embedding model to capture the topological structure between entities. The authors also construct PrimeKG++, an enriched BKG with multimodal data for each entity type, which supports link prediction for nodes unseen during training. Contribution/Results: The approach improves link prediction on both PrimeKG++ and the DrugBank drug-target interaction dataset, indicating that it generalizes across biomedical datasets. Code, pretrained models, and data are publicly released.
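
To make the contrastive part of the summary concrete, here is a minimal sketch of how two modality views of the same entities could be aligned. It assumes an InfoNCE-style objective over cosine similarities; the summary does not name the exact contrastive loss used in the paper, and the function names, temperature, and toy embeddings below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss (assumed objective, not confirmed by the paper).

    Row i of view_a and row i of view_b describe the same entity (positive
    pair); every other row of view_b acts as a negative for that anchor.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: one view from sequences, one from text.
seq_emb = [[1.0, 0.0], [0.0, 1.0]]
txt_aligned = [[0.9, 0.1], [0.1, 0.9]]   # text view agrees with sequence view
txt_swapped = [[0.1, 0.9], [0.9, 0.1]]   # text view paired with the wrong entity

loss_aligned = info_nce(seq_emb, txt_aligned)
loss_swapped = info_nce(seq_emb, txt_swapped)
print(loss_aligned < loss_swapped)
```

The loss is lower when the two views of each entity agree, which is the signal a contrastive encoder would be trained to produce.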

📝 Abstract
Biomedical Knowledge Graphs (BKGs) integrate diverse datasets to elucidate complex relationships within the biomedical field. Effective link prediction on these graphs can uncover valuable connections, such as potential novel drug-disease relations. We introduce a novel multimodal approach that unifies embeddings from specialized Language Models (LMs) with Graph Contrastive Learning (GCL) to enhance intra-entity relationships while employing a Knowledge Graph Embedding (KGE) model to capture inter-entity relationships for effective link prediction. To address limitations in existing BKGs, we present PrimeKG++, an enriched knowledge graph incorporating multimodal data, including biological sequences and textual descriptions for each entity type. By combining semantic and relational information in a unified representation, our approach demonstrates strong generalizability, enabling accurate link predictions even for unseen nodes. Experimental results on PrimeKG++ and the DrugBank drug-target interaction dataset demonstrate the effectiveness and robustness of our method across diverse biomedical datasets. Our source code, pre-trained models, and data are publicly available at https://github.com/HySonLab/BioMedKG
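
As a rough illustration of the link prediction step described in the abstract, the sketch below ranks candidate tail entities with a DistMult-style score. DistMult is an assumption chosen for brevity; the abstract does not say which KGE scorer is used. All entity names and vectors are hypothetical. The unseen candidate has no learned lookup entry here; its vector stands in for an embedding computed from the entity's sequence or text features, which is the property that would let a model score nodes absent from training.

```python
def distmult_score(head, relation, tail):
    """DistMult triple score: sum over dimensions of h_i * r_i * t_i."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


def rank_tails(head, relation, candidates):
    """Return (name, score) pairs sorted from most to least plausible tail."""
    scored = [
        (name, distmult_score(head, relation, emb))
        for name, emb in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings for one drug, one relation, and three diseases.
drug = [0.8, 0.1, 0.3]
treats = [1.0, 0.5, -0.2]
diseases = {
    "disease_a": [0.9, 0.2, 0.1],
    "disease_b": [-0.5, 0.4, 0.9],
    # Stand-in for an embedding derived from features of an unseen node.
    "disease_unseen": [0.7, 0.0, 0.2],
}

ranking = rank_tails(drug, treats, diseases)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the scorer only needs a vector per entity, any node whose features can be encoded can be ranked, whether or not it appeared in the training graph.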
Problem

Research questions and friction points this paper is trying to address.

Biomedical Knowledge Graphs
Text and Graph Integration
Drug-Disease Prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated Text and Graph Information
Predictive Modeling for Drug-Disease Associations
Comprehensive Biomedical Knowledge Graph PrimeKG++