🤖 AI Summary
To address the poor scalability and low training and sampling efficiency of SE(3)-equivariant diffusion models in 3D molecular generation, this paper proposes AlignDiff, a computationally efficient generative framework that dispenses with explicit equivariant architectures. Its core innovation is a sample-dependent SO(3) rotational alignment mechanism that pre-aligns each molecular conformation to a canonical orientation, yielding a geometry-normalized aligned latent space. A lightweight non-equivariant U-Net diffusion model is then trained exclusively in this aligned space. This design constitutes the first integration of rotational alignment with non-equivariant modeling, decoupling generation from SE(3)-equivariance constraints. Experiments on QM9 and GEOM-DRUGS show that AlignDiff achieves generation quality competitive with state-of-the-art equivariant models (with only a 12% drop in FCD), while accelerating training by 2.3× and sampling by 1.8×. The code is publicly available.
📝 Abstract
Equivariant diffusion models have achieved impressive performance in 3D molecule generation. These models incorporate Euclidean symmetries of 3D molecules by utilizing an SE(3)-equivariant denoising network. However, specialized equivariant architectures limit the scalability and efficiency of diffusion models. In this paper, we propose an approach that relaxes such equivariance constraints. Specifically, our approach learns a sample-dependent SO(3) transformation for each molecule to construct an aligned latent space. A non-equivariant diffusion model is then trained over the aligned representations. Experimental results demonstrate that our approach performs significantly better than previously reported non-equivariant models. It yields sample quality comparable to state-of-the-art equivariant diffusion models and offers improved training and sampling efficiency. Our code is available at https://github.com/skeletondyh/RADM
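To make the alignment idea concrete, here is a minimal sketch of mapping a molecule's coordinates into a canonical orientation so that a downstream model need not be rotation-equivariant. It is an illustration only: the paper learns the SO(3) transformation per sample, whereas this sketch swaps in a fixed PCA-based canonicalization, which is not the authors' method. The function name `canonical_align` and the random test data are hypothetical.

```python
import numpy as np


def canonical_align(coords):
    """Center an (N, 3) point cloud and rotate it into a PCA-defined frame.

    Hypothetical helper: a fixed PCA canonicalization, not a learned rotation.
    Assumes distinct covariance eigenvalues and nonzero skew along the first
    two principal axes; otherwise the frame is ambiguous.
    """
    centered = coords - coords.mean(axis=0)
    # eigh returns eigenvalues in ascending order; reverse so that the
    # direction of largest variance comes first.
    _, vecs = np.linalg.eigh(centered.T @ centered)
    vecs = vecs[:, ::-1]

    axes = []
    for k in range(2):
        axis = vecs[:, k]
        # Eigenvectors are defined only up to sign; pick the sign that makes
        # the third moment of the projections non-negative.
        skew = np.sum((centered @ axis) ** 3)
        axes.append(axis if skew >= 0 else -axis)
    # The cross product completes a right-handed frame, so det(rot) = +1.
    axes.append(np.cross(axes[0], axes[1]))
    rot = np.stack(axes, axis=1)
    return centered @ rot, rot


rng = np.random.default_rng(0)
coords = rng.normal(size=(9, 3)) * np.array([3.0, 2.0, 1.0])

# Build a random proper rotation (det = +1) and apply it with a translation.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1
moved = coords @ q.T + np.array([1.0, -2.0, 0.5])

aligned_a, rot_a = canonical_align(coords)
aligned_b, rot_b = canonical_align(moved)
print(np.allclose(aligned_a, aligned_b, atol=1e-8))
```

The final check compares the aligned coordinates of a molecule and of a rotated, translated copy. If the two agree, a model trained on aligned coordinates sees the same input for every pose of the same conformation. A learned alignment, as in the paper, would replace the PCA step with a trainable, sample-dependent rotation.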