MolSculpt: Sculpting 3D Molecular Geometries from Chemical Syntax

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the semantic gap between one-dimensional molecular representations (e.g., SMILES/SELFIES) and three-dimensional geometric generation. To bridge this gap, we propose a novel “grammar-driven sculpting” paradigm. Methodologically, we introduce the first end-to-end injection of frozen 1D chemical foundation model knowledge—such as a SELFIES encoder—into the conditional space of a 3D diffusion model, without fine-tuning the 1D model. This is achieved via a learnable chemical knowledge query module and a cross-modal projector that precisely maps chemical priors encoded in the 1D representation onto the 3D conformational generation process. Evaluated on GEOM-DRUGS and QM9, our approach achieves state-of-the-art performance in 3D molecular generation, significantly improving geometric accuracy (reduced bond, angle, and dihedral errors), conformational diversity (↑ Coverage), and sampling stability (↑ Validity), thereby effectively closing the semantic gap between 1D syntactic representations and 3D spatial configurations.
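
The geometric accuracy mentioned above concerns three local quantities: bond lengths, bond angles, and dihedral (torsion) angles. The sketch below only shows how these quantities can be computed from 3D coordinates with NumPy. It is an illustration, not the paper's evaluation code: the function names and the toy four-atom chain are hypothetical, and the benchmark may aggregate these quantities differently (for example, by comparing distributions over a whole test set).

```python
import numpy as np


def bond_length(a, b):
    """Euclidean distance between atoms a and b."""
    return float(np.linalg.norm(b - a))


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, measured at the central atom b."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


def dihedral(a, b, c, d):
    """Signed torsion a-b-c-d in degrees (sign convention is a choice)."""
    b0 = a - b
    b1 = c - b
    b2 = d - c
    b1 = b1 / np.linalg.norm(b1)
    # Project the outer bonds onto the plane perpendicular to the b-c axis.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


# Hypothetical four-atom chain, coordinates in arbitrary units.
atoms = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 1.5],
    [0.0, 1.0, 1.5],
])
length_bc = bond_length(atoms[1], atoms[2])
angle_abc = bond_angle(atoms[0], atoms[1], atoms[2])
torsion = dihedral(*atoms)
print(length_bc, angle_abc, abs(torsion))
```

Comparing these values between a generated conformer and a reference conformer gives per-molecule bond, angle, and dihedral errors.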

📝 Abstract
Generating precise 3D molecular geometries is crucial for drug discovery and material science. While prior efforts leverage 1D representations like SELFIES to ensure molecular validity, they fail to fully exploit the rich chemical knowledge entangled within 1D models, leading to a disconnect between 1D syntactic generation and 3D geometric realization. To bridge this gap, we propose MolSculpt, a novel framework that "sculpts" 3D molecular geometries from chemical syntax. MolSculpt is built upon a frozen 1D molecular foundation model and a 3D molecular diffusion model. We introduce a set of learnable queries to extract inherent chemical knowledge from the foundation model, and a trainable projector then injects this cross-modal information into the conditioning space of the diffusion model to guide the 3D geometry generation. In this way, our model deeply integrates 1D latent chemical knowledge into the 3D generation process through end-to-end optimization. Experiments demonstrate that MolSculpt achieves state-of-the-art (SOTA) performance in de novo 3D molecule generation and conditional 3D molecule generation, showing superior 3D fidelity and stability on both the GEOM-DRUGS and QM9 datasets. Code is available at https://github.com/SakuraTroyChen/MolSculpt.
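
The query-and-projector idea in the abstract can be sketched as a single cross-attention step followed by a linear map. This is a forward-pass illustration in NumPy under stated assumptions, not the released implementation: all dimensions, weight names, and the `query_and_project` function are hypothetical, the frozen 1D encoder output is replaced by random token states, and the real model may use multiple heads, several layers, or a different projector.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def query_and_project(token_states, queries, w_k, w_v, w_proj):
    """Cross-attend learnable queries to 1D token states, then project.

    token_states: (T, d_1d) hidden states from the frozen 1D model
    queries:      (Q, d)    learnable query vectors (trainable)
    w_k, w_v:     (d_1d, d) key/value maps (trainable)
    w_proj:       (d, d_cond) projector into the conditioning space (trainable)
    """
    keys = token_states @ w_k                                # (T, d)
    values = token_states @ w_v                              # (T, d)
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])   # (Q, T)
    attn = softmax(scores, axis=-1)                          # rows sum to 1
    pooled = attn @ values                                   # (Q, d)
    return pooled @ w_proj, attn                             # (Q, d_cond)


# Hypothetical sizes, chosen only for illustration.
T, d_1d, d, Q, d_cond = 12, 32, 16, 4, 24

# Stand-in for the frozen encoder's per-token output; no gradient would
# flow into the 1D model that produces these states.
token_states = rng.normal(size=(T, d_1d))

# Parameters that would be optimized end-to-end with the diffusion model.
queries = rng.normal(size=(Q, d))
w_k = rng.normal(size=(d_1d, d)) / np.sqrt(d_1d)
w_v = rng.normal(size=(d_1d, d)) / np.sqrt(d_1d)
w_proj = rng.normal(size=(d, d_cond)) / np.sqrt(d)

cond, attn = query_and_project(token_states, queries, w_k, w_v, w_proj)
print(cond.shape)  # (4, 24)
```

The fixed number of queries gives the diffusion model a conditioning signal whose size does not depend on the length of the SELFIES string; in this sketch, `cond` is what a denoiser would receive as its condition.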
Problem

Research questions and friction points this paper is trying to address.

Bridges 1D chemical syntax and 3D geometry generation
Extracts and injects chemical knowledge into 3D diffusion models
Achieves state-of-the-art de novo and conditional 3D molecule generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sculpts 3D geometries from chemical syntax
Uses frozen 1D foundation and 3D diffusion models
Integrates cross-modal knowledge via learnable queries
Zhanpeng Chen
Peking University
Vision-language Model
Weihao Gao
Moonshot AI
Machine Learning, Deep Learning, Information Theory
Shunyu Wang
AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
Yanan Zhu
AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China; Faculty of Materials Science, Shenzhen MSU-BIT University, Shenzhen, China
Hong Meng
AI for Science (AI4S)-Preferred Program, Peking University Shenzhen Graduate School, China
Yuexian Zou
Peking University Shenzhen Graduate School
Machine Learning, Speech Processing, Image Processing