BoKDiff: Best-of-K Diffusion Alignment for Target-Specific 3D Molecule Generation

📅 2025-01-26
🤖 AI Summary
Structure-based drug design (SBDD) faces two key challenges: the difficulty of aligning generated ligands with their target proteins and the scarcity of high-quality protein–ligand complex data. To address these, the paper proposes BoKDiff, which the authors describe as the first application of Best-of-K alignment and Best-of-N (BoN) sampling to SBDD. Built upon the DecompDiff diffusion model, the method generates diverse candidates, ranks them by a weighted combination of molecular properties (QED, SA, and Vina docking scores), and relocates each generated ligand's center of mass to its docking pose so that sub-components can be extracted accurately. BoN sampling then selects the best candidate without any fine-tuning. On the CrossDocked2020 benchmark, BoKDiff reports state-of-the-art results: a mean Vina docking score of −8.58 and a 26% generation success rate. With BoN sampling it reaches QED > 0.6, SA > 0.75, and a success rate of 35.2%.
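The property-weighted ranking and Best-of-N selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the equal default weights, and the `vina_floor` normalization are all assumptions made for illustration, and the QED, SA, and Vina values are taken as already computed by external tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record holding precomputed properties of one generated ligand."""
    name: str
    qed: float   # drug-likeness in [0, 1], higher is better
    sa: float    # normalized synthetic accessibility in [0, 1], higher is better
    vina: float  # docking score in kcal/mol, more negative is better


def weighted_score(c, w_qed=1.0, w_sa=1.0, w_vina=1.0, vina_floor=-12.0):
    """Combine the three properties into one scalar (weights are assumptions)."""
    # Map the Vina score onto [0, 1] so all three terms share a scale;
    # vina_floor is an assumed "best plausible" docking score.
    vina_term = min(max(c.vina / vina_floor, 0.0), 1.0)
    return w_qed * c.qed + w_sa * c.sa + w_vina * vina_term


def best_of_n(candidates, **weights):
    """Return the highest-scoring candidate; no model fine-tuning is involved."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: weighted_score(c, **weights))


candidates = [
    Candidate("a", qed=0.5, sa=0.6, vina=-6.0),
    Candidate("b", qed=0.7, sa=0.8, vina=-9.0),
    Candidate("c", qed=0.9, sa=0.4, vina=-3.0),
]
best = best_of_n(candidates)
print(best.name, round(weighted_score(best), 3))
```

Selection happens purely at sampling time, so changing the weights changes which candidate is kept without touching the generator.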

📝 Abstract
Structure-based drug design (SBDD) leverages the 3D structure of biomolecular targets to guide the creation of new therapeutic agents. Recent advances in generative models, including diffusion models and geometric deep learning, have demonstrated promise in optimizing ligand generation. However, the scarcity of high-quality protein-ligand complex data and the inherent challenges in aligning generated ligands with target proteins limit the effectiveness of these methods. We propose BoKDiff, a novel framework that enhances ligand generation by combining multi-objective optimization and Best-of-K alignment methodologies. Built upon the DecompDiff model, BoKDiff generates diverse candidates and ranks them using a weighted evaluation of molecular properties such as QED, SA, and docking scores. To address alignment challenges, we introduce a method that relocates the center of mass of generated ligands to their docking poses, enabling accurate sub-component extraction. Additionally, we integrate a Best-of-N (BoN) sampling approach, which selects the optimal ligand from multiple generated candidates without requiring fine-tuning. BoN achieves exceptional results, with QED values exceeding 0.6, SA scores above 0.75, and a success rate surpassing 35%, demonstrating its efficiency and practicality. BoKDiff achieves state-of-the-art results on the CrossDocked2020 dataset, including a -8.58 average Vina docking score and a 26% success rate in molecule generation. This study is the first to apply Best-of-K alignment and Best-of-N sampling to SBDD, highlighting their potential to bridge generative modeling with practical drug discovery requirements. The code is provided at https://github.com/khodabandeh-ali/BoKDiff.git.
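The center-of-mass relocation mentioned in the abstract amounts to a rigid translation, which the following sketch illustrates. It assumes atom coordinates and atomic masses are available as plain Python sequences; the function names are hypothetical and the authors' actual procedure may differ in detail.

```python
def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of 3D atom coordinates."""
    total = sum(masses)
    return tuple(
        sum(m * xyz[i] for xyz, m in zip(coords, masses)) / total
        for i in range(3)
    )


def relocate_to_pose(gen_coords, gen_masses, pose_coords, pose_masses):
    """Translate the generated ligand so its center of mass matches the docked pose.

    Only a translation is applied; bond lengths and internal geometry are unchanged.
    """
    src = center_of_mass(gen_coords, gen_masses)
    dst = center_of_mass(pose_coords, pose_masses)
    shift = tuple(d - s for d, s in zip(dst, src))
    return [tuple(x + dx for x, dx in zip(xyz, shift)) for xyz in gen_coords]


# Toy two-atom ligand (carbon and oxygen masses) and a docked pose elsewhere in space.
gen = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pose = [(5.0, 5.0, 5.0), (7.0, 5.0, 5.0)]
masses = [12.0, 16.0]
moved = relocate_to_pose(gen, masses, pose, masses)
```

After the shift, the generated ligand sits in the same frame as its docking pose, which is the precondition the abstract gives for extracting sub-components accurately.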
Problem

Research questions and friction points this paper is trying to address.

Drug Design
Protein Alignment
3D Molecular Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

BoKDiff
Multi-objective Optimization
Best-of-K Alignment
Ali Khodabandeh Yalabadi
Department of Industrial Engineering and Management Systems, University of Central Florida, 4000 Central Florida Blvd, 32816, FL, USA
Mehdi Yazdani-Jahromi
University of Central Florida
artificial intelligence, computational drug discovery, algorithmic fairness
Ozlem Ozmen Garibay
Assistant Professor, University of Central Florida