AI Summary
This work addresses the scarcity of high-quality, reproducible, and open-source machine learning interatomic potential (MLIP) databases for molecular crystals, which has hindered their application in polymorph and thermodynamic simulations. We present the first open MLIP database specifically designed for molecular crystals, built upon the MACE-MH-1 pretrained model and an automated machine learning pipeline (AMLP). Covering nine representative chemical systems, our framework encompasses the entire workflow, from generation of high-fidelity reference data to model fine-tuning and validation. The resulting models achieve mean absolute errors of 0.141 kJ/mol/atom for energy and 0.648 kJ/mol/Å for forces, and demonstrate excellent energy conservation and structural stability in molecular dynamics simulations. This approach significantly enhances model generalizability, development efficiency, and reproducibility, enabling efficient simulation of polymorphic behavior under diverse thermodynamic conditions.
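For readers who want to reproduce figures of this kind, the following is a minimal sketch of how a per-atom energy MAE and a force MAE could be computed. It assumes reference and predicted values are available in eV and eV/Å (the units common to MACE and ASE workflows) and converts them to kJ/mol. The function names and the averaging convention (a mean over structures for energies, a mean over all Cartesian components for forces) are illustrative assumptions; the source does not state how its errors are aggregated.

```python
EV_TO_KJ_PER_MOL = 96.485  # 1 eV per particle is about 96.485 kJ/mol


def energy_mae_per_atom(e_ref, e_pred, n_atoms):
    """Mean absolute per-atom energy error.

    Inputs are total energies in eV and atom counts per structure;
    the result is in kJ/mol/atom. (Hypothetical helper, not from the source.)
    """
    errors = [abs(p - r) / n for r, p, n in zip(e_ref, e_pred, n_atoms)]
    return sum(errors) / len(errors) * EV_TO_KJ_PER_MOL


def force_mae(f_ref, f_pred):
    """Mean absolute error over all Cartesian force components.

    f_ref and f_pred hold one list per structure of per-atom (fx, fy, fz)
    tuples in eV/Å; the result is in kJ/mol/Å. (Hypothetical helper.)
    """
    errors = [
        abs(p - r)
        for frame_r, frame_p in zip(f_ref, f_pred)
        for atom_r, atom_p in zip(frame_r, frame_p)
        for r, p in zip(atom_r, atom_p)
    ]
    return sum(errors) / len(errors) * EV_TO_KJ_PER_MOL
```

Normalizing each energy error by the atom count before averaging keeps large and small unit cells on the same footing, which is presumably why the errors are quoted per atom.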
Abstract
We present MolCryst-MLIPs, an open database of Machine-Learned Interatomic Potentials (MLIPs) for Molecular Crystals (MCs). The first release comprises fine-tuned MACE models for nine molecular crystal systems (Benzamide, Benzoic acid, Coumarin, Durene, Isonicotinamide, Niacinamide, Nicotinamide, Pyrazinamide, and Resorcinol) developed using the Automated Machine Learning Pipeline (AMLP), which streamlines the entire MLIP development workflow, from reference data generation to model training and validation, into a reproducible and user-friendly pipeline. Models are fine-tuned from the MACE-MH-1 foundation model (omol head), yielding a mean energy MAE of 0.141 kJ/mol/atom and a mean force MAE of 0.648 kJ/mol/Å across all systems. Dynamical stability and structural integrity are evaluated in molecular dynamics simulations through energy conservation, P2 orientational order parameters, and radial distribution functions. The released models and datasets constitute a growing open database of validated MLIPs, ready for production MD simulations of molecular crystal polymorphism under different thermodynamic conditions.
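As an illustration of the P2 orientational order parameter mentioned above, the sketch below averages the second Legendre polynomial, P2(cos θ) = (3 cos²θ − 1)/2, over the angles θ between each molecular axis and a fixed reference director. How the molecular axis and the director are defined for each crystal is not given in the source, so both are treated here as caller-supplied assumptions, and the function name is hypothetical.

```python
import math


def p2_order_parameter(axes, director):
    """Average P2(cos theta) between molecular axes and a reference director.

    axes: iterable of 3-component vectors, one per molecule (assumed input).
    director: 3-component reference direction (assumed input).
    Vectors need not be normalized; they are normalized internally.
    """
    d_norm = math.sqrt(sum(c * c for c in director))
    total = 0.0
    count = 0
    for axis in axes:
        a_norm = math.sqrt(sum(c * c for c in axis))
        cos_theta = sum(a * d for a, d in zip(axis, director)) / (a_norm * d_norm)
        # Second Legendre polynomial; even in cos_theta, so axis sign is irrelevant.
        total += 0.5 * (3.0 * cos_theta ** 2 - 1.0)
        count += 1
    return total / count
```

A value near 1 indicates that the molecules remain aligned with the reference direction, a value near 0 indicates orientational disorder, and −0.5 corresponds to axes lying perpendicular to the director. Tracking this quantity along an MD trajectory is one way to check that a crystal keeps its molecular orientations.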