Contrastive Multi-Task Learning with Solvent-Aware Augmentation for Drug Discovery

📅 2025-08-03
🤖 AI Summary
Existing methods struggle to model solvent-dependent conformational changes in protein–ligand systems and do not learn related tasks jointly. To address this, we propose a solvent-aware multi-task pretraining framework. First, it constructs molecular conformational ensembles under diverse solvent conditions and learns solvent-invariant representations via contrastive learning. Second, it jointly pretrains on molecular reconstruction, interatomic distance prediction, and contrastive learning while integrating data from heterogeneous solvent environments. Reported results include a 3.7% relative improvement in binding affinity prediction, an 82% success rate on the PoseBusters Astex benchmark, a virtual screening AUC of 97.1%, and, in a case study, a docking RMSD of 0.157 Å. The core contribution is the first unified integration of explicit solvent modeling, flexible conformational representation, and multi-task pretraining within the drug discovery paradigm.

📝 Abstract
Accurate prediction of protein-ligand interactions is essential for computer-aided drug discovery. However, existing methods often fail to capture solvent-dependent conformational changes and lack the ability to jointly learn multiple related tasks. To address these limitations, we introduce a pre-training method that incorporates ligand conformational ensembles generated under diverse solvent conditions as augmented input. This design enables the model to learn both structural flexibility and environmental context in a unified manner. The training process integrates molecular reconstruction to capture local geometry, interatomic distance prediction to model spatial relationships, and contrastive learning to build solvent-invariant molecular representations. Together, these components lead to significant improvements, including a 3.7% gain in binding affinity prediction, an 82% success rate on the PoseBusters Astex docking benchmarks, and an area under the curve of 97.1% in virtual screening. The framework supports solvent-aware, multi-task modeling and produces consistent results across benchmarks. A case study further demonstrates sub-angstrom docking accuracy with a root-mean-square deviation of 0.157 angstroms, offering atomic-level insight into binding mechanisms and advancing structure-based drug design.
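The contrastive component described in the abstract can be illustrated with a minimal NumPy sketch. It assumes an InfoNCE-style loss, which the abstract does not name; the function `info_nce`, the `temperature` value, and the embedding shapes are all hypothetical. Row i of each input stands for the same ligand embedded from conformers generated under two different solvents, so matching rows are positives and all other rows are negatives.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss (an assumed choice, not confirmed by the paper).

    z_a[i] and z_b[i] are embeddings of the same molecule under two
    solvent conditions; every other pairing is treated as a negative.
    """
    # L2-normalise so the dot product becomes cosine similarity
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # numerically stable row-wise log-softmax
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives sit on the diagonal
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
z_water = rng.normal(size=(8, 16))                    # hypothetical embeddings, solvent A
z_dmso = z_water + 0.01 * rng.normal(size=(8, 16))    # same molecules, solvent B
print(info_nce(z_water, z_dmso))
```

Minimising this loss pulls together the embeddings of one molecule across solvents and pushes apart embeddings of different molecules, which is one common way to obtain representations that are invariant to the solvent condition.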
Problem

Research questions and friction points this paper is trying to address.

Predict protein-ligand interactions accurately for drug discovery
Capture solvent-dependent conformational changes in ligands
Jointly learn multiple related tasks in drug discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

Solvent-aware augmentation for diverse ligand conformations
Multi-task learning with contrastive molecular representations
Joint training with reconstruction and distance prediction
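As a rough illustration of how the three training signals listed above might be combined, the sketch below builds an interatomic distance target from 3D coordinates and sums three loss terms. The mean-squared-error form of the reconstruction and distance terms, the equal default weights, and the names `pairwise_distances` and `multitask_loss` are assumptions for illustration; the page does not state the actual objective.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between N atoms."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def multitask_loss(pred_coords, true_coords, pred_dist, contrastive,
                   weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three terms (weights are hypothetical):
    coordinate reconstruction, distance prediction, contrastive loss."""
    recon = np.mean((pred_coords - true_coords) ** 2)
    dist = np.mean((pred_dist - pairwise_distances(true_coords)) ** 2)
    w_recon, w_dist, w_con = weights
    return float(w_recon * recon + w_dist * dist + w_con * contrastive)

# Toy two-atom "ligand": the atoms are exactly 5 Å apart.
coords = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
target = pairwise_distances(coords)
print(target[0, 1])
print(multitask_loss(coords, coords, target, contrastive=0.5))
```

With perfect reconstruction and distance predictions, only the contrastive term contributes, which makes the role of each weight easy to inspect in isolation.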
Jing Lan
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Hexiao Ding
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Hongzhao Chen
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Yufeng Jiang
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Ng Nga Chun
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China; Department of Nuclear Medicine and PET, Hong Kong Sanatorium and Hospital, Hong Kong SAR, China
Gerald W. Y. Cheng
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Zongxi Li
School of Data Science, Lingnan University
natural language processing, sentiment analysis, education technology
Jing Cai
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Liang-ting Lin
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
Jung Sun Yoo
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China