🤖 AI Summary
This work addresses the lack of a unified framework for data-driven modelling of molecular crystals. It introduces MXtalTools, a modular, CUDA-accelerated Python toolkit for machine learning studies of the molecular solid state. The package covers synthesis, collation, and curation of molecule and crystal datasets; crystal parameterization and representation; crystal structure sampling and optimization; integrated workflows for model training and inference; and end-to-end differentiable crystal sampling, construction, and analysis. The contributions are threefold: (1) an open-source framework whose modular functions can be integrated into existing workflows or combined into new modelling pipelines; (2) CUDA acceleration that enables high-throughput crystal modelling; and (3) detailed documentation on ReadTheDocs, with the code publicly released on GitHub.
📝 Abstract
We present MXtalTools, a flexible Python package for the data-driven modelling of molecular crystals, facilitating machine learning studies of the molecular solid state. MXtalTools comprises several classes of utilities: (1) synthesis, collation, and curation of molecule and crystal datasets, (2) integrated workflows for model training and inference, (3) crystal parameterization and representation, (4) crystal structure sampling and optimization, and (5) end-to-end differentiable crystal sampling, construction, and analysis. Our modular functions can be integrated into existing workflows or combined to build novel modelling pipelines. MXtalTools leverages CUDA acceleration to enable high-throughput crystal modelling. The Python code is available as open source on our GitHub page, with detailed documentation on ReadTheDocs.
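To make "crystal parameterization and representation" concrete, the following minimal sketch converts the six standard unit-cell parameters (lengths a, b, c and angles alpha, beta, gamma) into Cartesian lattice vectors and maps fractional coordinates to Cartesian ones. It uses only the Python standard library and the textbook crystallographic convention (a along x, b in the xy-plane). The function names `cell_vectors`, `cell_volume`, and `frac_to_cart` are hypothetical and are not taken from the MXtalTools API; the package's own parameterization may differ.

```python
import math


def cell_vectors(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Return Cartesian lattice vectors (va, vb, vc) for a unit cell.

    Hypothetical helper, not part of MXtalTools. Convention: va lies
    along x, vb lies in the xy-plane, vc completes the cell.
    """
    alpha, beta, gamma = (math.radians(x) for x in (alpha_deg, beta_deg, gamma_deg))
    ca, cb, cg = math.cos(alpha), math.cos(beta), math.cos(gamma)
    sg = math.sin(gamma)
    # Dimensionless volume factor: V = a * b * c * vol_factor
    vol_factor = math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)
    va = (a, 0.0, 0.0)
    vb = (b * cg, b * sg, 0.0)
    vc = (c * cb, c * (ca - cb * cg) / sg, c * vol_factor / sg)
    return va, vb, vc


def cell_volume(va, vb, vc):
    """Cell volume; with this convention it is the product of the diagonal terms."""
    return va[0] * vb[1] * vc[2]


def frac_to_cart(frac, vectors):
    """Map fractional coordinates (f1, f2, f3) to Cartesian coordinates."""
    return tuple(sum(frac[i] * vectors[i][k] for i in range(3)) for k in range(3))


if __name__ == "__main__":
    # Illustrative monoclinic cell; the numbers are arbitrary, not from any dataset.
    vectors = cell_vectors(7.0, 9.0, 11.0, 90.0, 105.0, 90.0)
    print("volume:", cell_volume(*vectors))
    print("cell centre:", frac_to_cart((0.5, 0.5, 0.5), vectors))
```

Because every step is a smooth function of the cell parameters, the same arithmetic written with a tensor library would admit gradients with respect to those parameters, which is the general idea behind differentiable crystal construction.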