🤖 AI Summary
Traditional molecular generation methods are constrained by the limited expressiveness of sequence- or graph-based representations and by the difficulty of enforcing equivariance. This paper introduces the first function-representation-based paradigm for molecular generation: molecules are modeled as input-output mapping functions, and a diffusion model is constructed directly in function space, jointly denoising both the domain and the codomain. The approach integrates implicit neural representations (INRs) for continuous, function-level modeling and proposes an EM-optimized joint denoising mechanism that balances modeling fidelity with architectural simplicity. On multiple benchmark datasets, the method surpasses state-of-the-art data-space approaches: it significantly improves the chemical validity of generated molecules, accelerates inference by 2.1–3.8×, and reduces parameter count by 37%–59%. These results empirically validate the effectiveness and efficiency of function-space modeling for molecular generation.
📄 Abstract
Traditional molecule generation methods often rely on sequence- or graph-based representations, which can limit their expressive power or require complex permutation-equivariant architectures. This paper introduces a novel paradigm for learning molecule generative models based on functional representations. Specifically, we propose Molecular Implicit Neural Generation (MING), a diffusion-based model that learns molecular distributions in the function space. Unlike standard diffusion processes in the data space, MING employs a novel functional denoising probabilistic process, which jointly denoises information in both the function's input and output spaces by leveraging an expectation-maximization procedure for latent implicit neural representations of data. This approach enables a simple yet effective model design that accurately captures underlying function distributions. Experimental results on molecule-related datasets demonstrate MING's superior performance and ability to generate plausible molecular samples, surpassing state-of-the-art data-space methods while offering a more streamlined architecture and significantly faster generation times. The code is available at https://github.com/v18nguye/MING.
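To make the two core ideas concrete, here is a minimal sketch of (a) an implicit neural representation that maps input coordinates (the function's domain) to output values (its codomain), and (b) a single forward-diffusion step that corrupts both the domain and codomain samples jointly, as the abstract describes. Everything here is illustrative: the SIREN-style sine MLP, the layer sizes, and the noise schedule are assumptions for exposition, not MING's actual architecture or denoising procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_inr(in_dim=1, hidden=32, out_dim=1):
    """Toy INR: a small MLP whose weights parameterize one function.
    Sizes are illustrative, not taken from the paper."""
    return {
        "W1": rng.normal(0.0, 1.0, (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }

def inr_forward(params, x):
    # Sine activation, as commonly used in INRs (SIREN-style).
    h = np.sin(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def joint_noise_step(x, y, t, betas):
    """One forward-diffusion step applied jointly to the function's
    inputs (domain) and outputs (codomain)."""
    a = np.sqrt(1.0 - betas[t])
    s = np.sqrt(betas[t])
    x_t = a * x + s * rng.normal(size=x.shape)
    y_t = a * y + s * rng.normal(size=y.shape)
    return x_t, y_t

# Represent one "function sample": evaluate the INR on a coordinate grid,
# then corrupt coordinates and values together with one diffusion step.
params = init_inr()
x = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)   # domain samples
y = inr_forward(params, x)                      # codomain samples
betas = np.linspace(1e-4, 0.02, 100)            # toy linear schedule
x_t, y_t = joint_noise_step(x, y, t=50, betas=betas)
print(x_t.shape, y_t.shape)                     # (64, 1) (64, 1)
```

A learned reverse process would then denoise `(x_t, y_t)` pairs back toward clean function samples; the paper's EM procedure for latent INRs handles how the function parameters are inferred during that loop.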