Contrastive Multi-Task Learning with Solvent-Aware Augmentation for Drug Discovery

📅 2025-08-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods struggle to model solvent-dependent conformational changes in protein-ligand systems and do not learn related tasks jointly. This paper proposes a solvent-aware multi-task pretraining framework. It generates ligand conformational ensembles under diverse solvent conditions and uses them as augmented input, then trains jointly on three objectives: molecular reconstruction, interatomic distance prediction, and contrastive learning of solvent-invariant representations. Reported results are a 3.7% gain in binding affinity prediction, an 82% success rate on the PoseBusters Astex docking benchmark, and a virtual screening AUC of 97.1%; a separate case study reports a docking RMSD of 0.157 Å. The main contribution is the combination of solvent-aware augmentation, flexible conformational representation, and multi-task pretraining in a single framework for drug discovery.
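
For readers unfamiliar with the docking metric quoted above, the following is a minimal sketch of how a root-mean-square deviation between a predicted and a reference pose can be computed. It assumes both poses list the same atoms in the same order and are already in the same coordinate frame; it performs no alignment or symmetry correction, and the paper's exact evaluation protocol is not described here. The function name `rmsd` and the three-atom coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) points, in the input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared / len(coords_a))

# Hypothetical three-atom pose: every atom is displaced by 0.1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
predicted = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (1.6, 1.2, 0.0)]
pose_rmsd = rmsd(reference, predicted)  # about 0.1 Å
```

A sub-angstrom value such as the reported 0.157 Å means the average per-atom displacement is well below the length of a typical covalent bond.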

📝 Abstract

Accurate prediction of protein-ligand interactions is essential for computer-aided drug discovery. However, existing methods often fail to capture solvent-dependent conformational changes and lack the ability to jointly learn multiple related tasks. To address these limitations, we introduce a pre-training method that incorporates ligand conformational ensembles generated under diverse solvent conditions as augmented input. This design enables the model to learn both structural flexibility and environmental context in a unified manner. The training process integrates molecular reconstruction to capture local geometry, interatomic distance prediction to model spatial relationships, and contrastive learning to build solvent-invariant molecular representations. Together, these components lead to significant improvements, including a 3.7% gain in binding affinity prediction, an 82% success rate on the PoseBusters Astex docking benchmarks, and an area under the curve of 97.1% in virtual screening. The framework supports solvent-aware, multi-task modeling and produces consistent results across benchmarks. A case study further demonstrates sub-angstrom docking accuracy with a root-mean-square deviation of 0.157 angstroms, offering atomic-level insight into binding mechanisms and advancing structure-based drug design.
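
The three training signals described in the abstract can be sketched as one weighted objective. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the loss functions, so mean squared error stands in for the reconstruction and distance terms, an InfoNCE-style loss stands in for the contrastive term, and the function names, weights, and temperature are all hypothetical.

```python
import math

def mse(pred, target):
    """Mean squared error between two equal-length sequences of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed choice).

    anchors[i] and positives[i] are embeddings of the same molecule under
    two solvent conditions; every other row of positives acts as a negative.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

def multitask_loss(recon_pred, recon_target, dist_pred, dist_target,
                   anchors, positives, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of reconstruction, distance, and contrastive terms."""
    w_recon, w_dist, w_contrast = weights
    return (w_recon * mse(recon_pred, recon_target)
            + w_dist * mse(dist_pred, dist_target)
            + w_contrast * info_nce(anchors, positives))
```

In this sketch the contrastive term is small when each molecule's two solvent views embed close together and far from other molecules, which is one way to read "solvent-invariant" representations.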

Problem

Research questions and friction points this paper is trying to address.

Predict protein-ligand interactions accurately for drug discovery

Capture solvent-dependent conformational changes in ligands

Jointly learn multiple related tasks in drug discovery

Innovation

Methods, ideas, or system contributions that make the work stand out.

Solvent-aware augmentation for diverse ligand conformations

Multi-task learning with contrastive molecular representations

Joint training with reconstruction and distance prediction
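
One way to picture the solvent-aware augmentation listed above is as a pairing step: conformers of the same ligand generated under different solvents form positive pairs for contrastive training. The sketch below is a hypothetical illustration only; the record layout `(molecule_id, solvent, conformer_id)`, the function name, and the example molecules and solvents are assumptions, and the paper's actual pairing scheme is not described in this summary.

```python
from collections import defaultdict
from itertools import combinations

def solvent_positive_pairs(conformers):
    """Pair conformers of the same molecule that come from different solvents.

    conformers: iterable of (molecule_id, solvent, conformer_id) records.
    Returns a list of (molecule_id, conformer_id_1, conformer_id_2) tuples.
    """
    by_molecule = defaultdict(list)
    for molecule_id, solvent, conformer_id in conformers:
        by_molecule[molecule_id].append((solvent, conformer_id))

    pairs = []
    for molecule_id, entries in by_molecule.items():
        for (solvent_1, conf_1), (solvent_2, conf_2) in combinations(entries, 2):
            if solvent_1 != solvent_2:  # same-solvent pairs are skipped
                pairs.append((molecule_id, conf_1, conf_2))
    return pairs

# Hypothetical records: two solvents for aspirin, one for caffeine.
records = [
    ("aspirin", "water", "a0"),
    ("aspirin", "dmso", "a1"),
    ("aspirin", "water", "a2"),
    ("caffeine", "water", "c0"),
]
positive_pairs = solvent_positive_pairs(records)
```

Here caffeine contributes no pairs because it has only one solvent condition, which shows why the ensemble needs at least two solvents per ligand for this kind of pairing to produce a training signal.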

🔎 Similar Papers

Advancing Drug Discovery with Enhanced Chemical Understanding via Asymmetric Contrastive Multimodal Learning