🤖 AI Summary
Modeling signed protein–protein interaction (PPI) networks—distinguishing activating versus inhibitory edges—remains challenging due to the functional asymmetry between positive and negative interactions.
Method: We propose a dual latent proximity model that constructs separate embedding spaces for positive and negative interactions, where geometric distances reflect functional similarity. Our framework integrates signed graph neural networks, prototype-based clustering, and Gene Ontology (GO) enrichment analysis to discover interpretable, biologically grounded protein prototypes.
Contribution/Results: We systematically uncover distinct GO functional module enrichments for positive versus negative PPIs, a first-of-its-kind finding. The model achieves statistically significant improvements over baselines in signed PPI link prediction (p < 1e−5), and top-ranked prototypes are validated experimentally. Structural robustness is confirmed via Bayesian Normalized Mutual Information (BNMI) evaluation. Crucially, the model disentangles interaction existence from sign prediction with high accuracy, enabling precise functional interpretation of regulatory logic.
📝 Abstract
Accurately predicting complex protein-protein interactions (PPIs) is crucial for decoding biological processes, from cellular function to disease mechanisms. However, experimental methods for determining PPIs are costly and time-consuming, which has recently drawn attention to machine learning approaches. Even so, little effort has gone into analyzing signed PPI (SPPI) networks, which capture both activating (positive) and inhibitory (negative) interactions. To represent these biological relationships accurately, we present the Signed Two-Space Proximity Model (S2-SPM) for SPPI networks, which explicitly incorporates both types of interactions and so reflects the complex regulatory mechanisms within biological systems. It does this by using two independent latent spaces to differentiate between positive and negative interactions, with protein similarity represented as proximity in these spaces. The approach also identifies archetypes, which represent extreme protein profiles. In link prediction tasks, S2-SPM outperforms relevant baseline methods at predicting both the presence and the sign of interactions in SPPI networks. The biological relevance of the identified archetypes is confirmed by an enrichment analysis of Gene Ontology (GO) terms, which reveals that distinct biological tasks are associated with the archetypal groups formed by each type of interaction. These findings are further supported by statistical significance testing and a sensitivity analysis, and they provide insight into the functional roles of the different interaction types. Finally, the robustness and consistency of the extracted archetype structures are confirmed using the Bayesian Normalized Mutual Information (BNMI) metric, supporting the model's reliability in capturing meaningful SPPI patterns.
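The two-space idea can be illustrated with a minimal sketch. It is not the S2-SPM formulation, which the abstract does not specify. It assumes each protein has one embedding per latent space and that affinity decays as `exp(-squared distance)`; the names `Z_pos`, `Z_neg`, `signed_affinities`, and `predict_sign`, the decay rule, and the toy coordinates are all hypothetical.

```python
import numpy as np


def pairwise_sq_dist(Z):
    """Squared Euclidean distance between every pair of rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return (diff ** 2).sum(axis=-1)


def signed_affinities(Z_pos, Z_neg):
    """Assumed scoring rule: closeness in a space gives affinity for that sign."""
    aff_pos = np.exp(-pairwise_sq_dist(Z_pos))  # high when close in the positive space
    aff_neg = np.exp(-pairwise_sq_dist(Z_neg))  # high when close in the negative space
    return aff_pos, aff_neg


def predict_sign(Z_pos, Z_neg, i, j):
    """Return +1 if the pair (i, j) is closer in the positive space, else -1."""
    aff_pos, aff_neg = signed_affinities(Z_pos, Z_neg)
    return 1 if aff_pos[i, j] >= aff_neg[i, j] else -1


# Toy 2-D embeddings for three hypothetical proteins (illustrative values only).
Z_pos = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]])
Z_neg = np.array([[0.0, 0.0], [2.0, 2.0], [0.1, 0.1]])

print(predict_sign(Z_pos, Z_neg, 0, 1))  # proteins 0 and 1 are close in the positive space
print(predict_sign(Z_pos, Z_neg, 0, 2))  # proteins 0 and 2 are close in the negative space
```

Keeping the two embedding matrices separate lets a pair of proteins be close with respect to one interaction type and far apart with respect to the other, which a single shared space cannot express.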