🤖 AI Summary
Protein–protein interaction (PPI) prediction faces several challenges: high experimental costs, weak cross-modal feature fusion, high false-negative rates, and insufficient model robustness. To address these, the authors propose SCMPPI, a novel supervised contrastive learning-driven multimodal PPI prediction framework that integrates sequence representations (AAC, DPC, and CKSAAP-ESMC) with Node2Vec-based topological network embeddings. The framework introduces a negative-sample filtering mechanism and a modified contrastive loss function to enable collaborative cross-modal optimization. In experiments on eight benchmark datasets, the method reaches 98.01% accuracy and 99.62% AUC, exceeds 99% AUC in cross-species prediction, and outperforms state-of-the-art approaches such as DF-PPI and TAGPPI. Applications to CD9 networks, the Wnt signaling pathway, and cancer-specific networks further support its use for disease target discovery.
📝 Abstract
Protein-protein interaction (PPI) prediction is a key task in uncovering cellular functional networks and disease mechanisms. However, traditional experimental methods are time-consuming and costly, and existing computational models face challenges in cross-modal feature fusion, robustness, and false-negative suppression. In this paper, we propose SCMPPI, a novel supervised contrastive multimodal framework for PPI prediction. SCMPPI integrates protein sequence features (AAC, DPC, CKSAAP-ESMC) with PPI network topology information (Node2Vec graph embedding) and combines them with an improved supervised contrastive learning strategy, which significantly enhances PPI prediction performance. To adapt contrastive learning to the PPI task, SCMPPI introduces a negative-sample filtering mechanism and modifies the contrastive loss function so that the multimodal features are optimized effectively. Experiments on eight benchmark datasets, including yeast, human, and H. pylori, show that SCMPPI outperforms existing state-of-the-art methods (such as DF-PPI and TAGPPI) on key metrics such as accuracy (98.01%) and AUC (99.62%), and demonstrates strong generalization in cross-species prediction (AUC > 99% on multi-species datasets). Furthermore, SCMPPI has been successfully applied to CD9 networks, the Wnt pathway, and cancer-specific networks, providing a reliable tool for disease target discovery. The framework also offers a new paradigm for fusing multimodal biological information and for collaboratively optimizing it with contrastive learning in other combined prediction tasks.
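To make the sequence features named in the abstract more concrete, the following is a minimal sketch of two of them under their standard definitions: amino acid composition (AAC, a 20-dimensional vector of residue frequencies) and dipeptide composition (DPC, a 400-dimensional vector of adjacent-pair frequencies). The function names `aac`, `dpc`, and `pair_features` are hypothetical, and concatenating the two proteins' vectors is an assumption made for illustration; the abstract does not specify how SCMPPI encodes or fuses these features, and CKSAAP-ESMC and the Node2Vec embedding are not covered here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq: str) -> list[float]:
    """Amino acid composition: frequency of each standard residue (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]


def dpc(seq: str) -> list[float]:
    """Dipeptide composition: frequency of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    # Keep only adjacent pairs made of two standard residues.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    counts = Counter(pairs)
    total = len(pairs)
    return [
        counts[a + b] / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


def pair_features(seq_a: str, seq_b: str) -> list[float]:
    """Hypothetical pair encoding: concatenate AAC and DPC of both proteins.

    This concatenation is an illustrative assumption, not the paper's fusion scheme.
    """
    return aac(seq_a) + dpc(seq_a) + aac(seq_b) + dpc(seq_b)
```

For a candidate protein pair, `pair_features(seq_a, seq_b)` yields an 840-dimensional vector (2 × (20 + 400)) that a downstream classifier could consume alongside a graph embedding.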