DMol: A Schedule-Driven Diffusion Model for Highly Efficient and Versatile Molecule Generation

📅 2025-04-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the low sampling efficiency, large step counts, and slow inference of diffusion models for small-molecule generation, this paper proposes DMol, a schedule-driven graph diffusion model. Methodologically, it introduces (1) a graph noise schedule that, at each diffusion step, changes only a subset of nodes of varying size, and (2) a compressed variant that folds frequent carbon rings into supernodes and runs graph diffusion directly on the compressed graph, avoiding the VAE-based reconstruction step of classical junction-tree methods. Across the benchmark datasets, DMol improves validity over DiGress by roughly 1.5% while reducing the number of diffusion steps by at least 10× and the running time to roughly one half. The compressed variant improves validity by roughly a further 2%, increases novelty, and shortens the running time further.

📝 Abstract

We introduce a new graph diffusion model for small molecule generation, DMol, which outperforms the state-of-the-art DiGress model in terms of validity by roughly 1.5% across all benchmarking datasets while reducing the number of diffusion steps by at least 10-fold, and the running time to roughly one half. The performance improvements are a result of a careful change in the objective function and a "graph noise" scheduling approach which, at each diffusion step, allows one to only change a subset of nodes of varying size in the molecule graph. Another relevant property of the method is that it can be easily combined with junction-tree-like graph representations that arise by compressing a collection of relevant ring structures into supernodes. Unlike classical junction-tree techniques that involve VAEs and require complicated reconstruction steps, compressed DMol directly performs graph diffusion on a graph that compresses only a carefully selected set of frequent carbon rings into supernodes, which results in straightforward sample generation. This compressed DMol method offers additional validity improvements over generic DMol of roughly 2%, increases the novelty of the method, and further improves the running time due to reductions in the graph size.
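
The "graph noise" scheduling idea in the abstract can be illustrated with a small sketch. The abstract states only that each diffusion step changes a subset of nodes of varying size; everything else here is an assumption made for illustration: the linear schedule in `nodes_to_perturb`, the uniform relabeling, the atom vocabulary `ATOM_TYPES`, and the function names. It is not the paper's implementation, and it ignores edges and the training objective.

```python
import random

# Hypothetical atom vocabulary, chosen only for this illustration.
ATOM_TYPES = ["C", "N", "O", "F"]

def nodes_to_perturb(t, num_steps, num_nodes):
    """Assumed linear schedule: how many nodes may change at step t (1-based).

    The subset size grows from a few nodes at t=1 to all nodes at t=num_steps.
    """
    # Integer ceiling of num_nodes * t / num_steps, capped at num_nodes.
    return min(num_nodes, (num_nodes * t + num_steps - 1) // num_steps)

def forward_step(atoms, t, num_steps, rng):
    """Noise one step: relabel a random node subset sized by the schedule.

    Nodes outside the chosen subset are left untouched. A chosen node may
    happen to redraw its original label.
    """
    m = nodes_to_perturb(t, num_steps, len(atoms))
    chosen = rng.sample(range(len(atoms)), m)
    noisy = list(atoms)
    for i in chosen:
        noisy[i] = rng.choice(ATOM_TYPES)
    return noisy, sorted(chosen)

if __name__ == "__main__":
    rng = random.Random(0)
    atoms = ["C", "C", "O", "N", "C", "C"]
    for t in range(1, 4):
        noisy, chosen = forward_step(atoms, t, 3, rng)
        print(f"t={t}: perturbed nodes {chosen} -> {noisy}")
```

With six nodes and three steps, this assumed schedule touches 2, 4, and then 6 nodes. Early steps therefore leave most of the molecule graph intact, which is the property the abstract attributes to the scheduling approach.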

Problem

Research questions and friction points this paper is trying to address.

Graph diffusion models such as DiGress need many diffusion steps and long running times

Validity of generated molecules still leaves room for improvement

Junction-tree-style compressed representations have relied on VAEs and complicated reconstruction steps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph noise scheduling for selective node changes

Compressed graph diffusion with supernodes

Improved validity and reduced runtime
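
The supernode compression listed above can be sketched as a graph contraction. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `compress_rings`, the supernode label `"C6_ring"`, and the plain dict/tuple graph encoding are all hypothetical, the rings are assumed to be disjoint and given in advance, and the sketch does not record which ring atom each substituent was attached to.

```python
def compress_rings(labels, edges, rings, ring_label="C6_ring"):
    """Contract each listed ring into a single supernode.

    labels: dict mapping node id -> atom symbol
    edges:  iterable of (u, v) pairs
    rings:  list of disjoint sets of node ids, one set per ring to compress
    Returns (new_labels, new_edges) for the compressed graph.
    """
    rep = {}         # original node id -> node id in the compressed graph
    new_labels = {}
    for k, ring in enumerate(rings):
        super_id = f"S{k}"
        new_labels[super_id] = ring_label
        for v in ring:
            rep[v] = super_id
    for v, lab in labels.items():
        if v not in rep:  # atoms outside every ring are kept unchanged
            rep[v] = v
            new_labels[v] = lab
    new_edges = set()
    for u, v in edges:
        a, b = rep[u], rep[v]
        if a != b:        # edges inside a ring disappear after contraction
            new_edges.add(frozenset((a, b)))
    return new_labels, new_edges

# Toluene-like example: a six-carbon ring (nodes 0-5) plus one carbon (node 6).
toluene_labels = {i: "C" for i in range(7)}
toluene_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
small_labels, small_edges = compress_rings(
    toluene_labels, toluene_edges, rings=[{0, 1, 2, 3, 4, 5}]
)
```

In this example the graph shrinks from seven nodes to two, which shows in miniature why diffusion on the compressed graph can run faster: the model has fewer nodes and edges to denoise.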

🔎 Similar Papers

LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models