MolSculpt: Sculpting 3D Molecular Geometries from Chemical Syntax

📅 2025-12-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the semantic gap between one-dimensional molecular representations (e.g., SMILES/SELFIES) and three-dimensional geometric generation. To bridge this gap, we propose a novel “grammar-driven sculpting” paradigm. Methodologically, we introduce the first end-to-end injection of frozen 1D chemical foundation model knowledge—such as a SELFIES encoder—into the conditional space of a 3D diffusion model, without fine-tuning the 1D model. This is achieved via a learnable chemical knowledge query module and a cross-modal projector that precisely maps chemical priors encoded in the 1D representation onto the 3D conformational generation process. Evaluated on GEOM-DRUGS and QM9, our approach achieves state-of-the-art performance in 3D molecular generation, significantly improving geometric accuracy (reduced bond, angle, and dihedral errors), conformational diversity (↑ Coverage), and sampling stability (↑ Validity), thereby effectively closing the semantic gap between 1D syntactic representations and 3D spatial configurations.
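To make the geometric quantities named above concrete, the following is a minimal sketch of how a bond length, a bond angle, and a dihedral angle can be computed from 3D atom coordinates. It is not the paper's evaluation protocol; the function names and the example coordinates (a water-like geometry) are assumptions chosen for illustration. An error metric would compare such values between generated and reference conformers.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def bond_length(p, q):
    """Euclidean distance between two bonded atoms."""
    return norm(sub(p, q))

def bond_angle(p, q, r):
    """Angle p-q-r in degrees, measured at the central atom q."""
    u, v = sub(p, q), sub(r, q)
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees around the p1-p2 bond (sign convention is a choice)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_unit = tuple(x / norm(b2) for x in b2)
    m1 = cross(n1, b2_unit)
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

# Hypothetical water-like geometry: O at the origin, O-H = 0.9572 A, H-O-H = 104.52 degrees.
theta = math.radians(104.52)
oxygen = (0.0, 0.0, 0.0)
h1 = (0.9572, 0.0, 0.0)
h2 = (0.9572 * math.cos(theta), 0.9572 * math.sin(theta), 0.0)

length = bond_length(oxygen, h1)
angle = bond_angle(h1, oxygen, h2)
# Four points forming a right-angle twist, used only to exercise dihedral().
torsion = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

Given a generated and a reference conformer with the same atom ordering, an absolute error for any of these quantities is just the absolute difference between the two computed values (with dihedral differences wrapped to the range of -180 to 180 degrees).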

📝 Abstract

Generating precise 3D molecular geometries is crucial for drug discovery and material science. While prior efforts leverage 1D representations like SELFIES to ensure molecular validity, they fail to fully exploit the rich chemical knowledge entangled within 1D models, leading to a disconnect between 1D syntactic generation and 3D geometric realization. To bridge this gap, we propose MolSculpt, a novel framework that "sculpts" 3D molecular geometries from chemical syntax. MolSculpt is built upon a frozen 1D molecular foundation model and a 3D molecular diffusion model. We introduce a set of learnable queries to extract inherent chemical knowledge from the foundation model, and a trainable projector then injects this cross-modal information into the conditioning space of the diffusion model to guide the 3D geometry generation. In this way, our model deeply integrates 1D latent chemical knowledge into the 3D generation process through end-to-end optimization. Experiments demonstrate that MolSculpt achieves state-of-the-art (SOTA) performance in de novo 3D molecule generation and conditional 3D molecule generation, showing superior 3D fidelity and stability on both the GEOM-DRUGS and QM9 datasets. Code is available at https://github.com/SakuraTroyChen/MolSculpt.
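The query-and-projector mechanism described in the abstract can be illustrated with a small, dependency-free sketch. It assumes a single-head cross-attention in which learnable queries attend over token features from the frozen 1D model, followed by a linear projector into the diffusion model's conditioning space. All names, dimensions, and the random stand-in features are hypothetical; the actual MolSculpt architecture may differ (for example, multiple heads, separate key/value projections, or several layers).

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def extract_condition(token_feats, queries, projector):
    """Cross-attend learnable queries over frozen token features, then project.

    token_feats: (num_tokens, d_model) output of the frozen 1D encoder (not trained)
    queries:     (num_queries, d_model) trainable query vectors
    projector:   (d_model, d_cond) trainable linear map
    returns:     ((num_queries, d_cond) conditioning vectors, attention weights)
    """
    d_model = len(queries[0])
    keys_t = [list(col) for col in zip(*token_feats)]   # (d_model, num_tokens)
    scores = matmul(queries, keys_t)                    # (num_queries, num_tokens)
    attn = [softmax([s / math.sqrt(d_model) for s in row]) for row in scores]
    pooled = matmul(attn, token_feats)                  # (num_queries, d_model)
    return matmul(pooled, projector), attn

rng = random.Random(0)
NUM_TOKENS, D_MODEL, NUM_QUERIES, D_COND = 12, 16, 4, 8  # hypothetical sizes

# Stand-in for the frozen encoder's per-token features of one SELFIES string.
token_feats = rand_matrix(NUM_TOKENS, D_MODEL, rng, scale=1.0)
queries = rand_matrix(NUM_QUERIES, D_MODEL, rng)
projector = rand_matrix(D_MODEL, D_COND, rng)

condition, attn = extract_condition(token_feats, queries, projector)
print(len(condition), len(condition[0]))
```

In a trainable version, only `queries` and `projector` (together with the diffusion model) would receive gradients, while the encoder producing `token_feats` stays fixed; that is what keeps the 1D foundation model frozen while still letting its features steer 3D generation.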

Problem

Research questions and friction points this paper is trying to address.

Bridges 1D chemical syntax and 3D geometry generation

Extracts and injects chemical knowledge into 3D diffusion models

Achieves state-of-the-art de novo and conditional 3D molecule generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sculpts 3D geometries from chemical syntax

Uses frozen 1D foundation and 3D diffusion models

Integrates cross-modal knowledge via learnable queries

🔎 Similar Papers

3D-MolT5: Towards Unified 3D Molecule-Text Modeling with 3D Molecular Tokenization