🤖 AI Summary
Current AI-generated crystal structures suffer from poor dynamical stability: on average only 25.83% of generated structures are phonon-stable, and even the best-performing model, MatterGen, reaches just 41.0%. Under band-gap conditioning with MatterGen the rate is only 23.5% at the best-performing target (0.5 eV), and under space-group control it averages 34.4%.
Method: We introduce PhononBench, the first large-scale benchmark for phonon-stability evaluation, comprising 108,843 structures generated by six state-of-the-art models. Using the MatterSim interatomic potential, which reaches DFT-level accuracy in phonon predictions, together with a high-throughput lattice dynamics pipeline, we compute phonons and assess stability across the full Brillouin zone.
Contribution/Results: Our analysis systematically highlights how current generative models fall short on dynamical stability. We publicly release 28,119 structures identified as phonon-stable across the entire Brillouin zone, alongside the complete evaluation workflow. PhononBench establishes a large-scale dynamical-stability benchmark and offers evaluation criteria and guidance for generating physically viable crystals.
📝 Abstract
In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals. Leveraging the recently developed MatterSim interatomic potential, which achieves DFT-level accuracy in phonon predictions across more than 10,000 materials, PhononBench enables efficient large-scale phonon calculations and dynamical-stability analysis for 108,843 crystal structures generated by six leading crystal generation models. PhononBench reveals a widespread limitation of current generative models in ensuring dynamical stability: the average dynamical-stability rate across all generated structures is only 25.83%, with the top-performing model, MatterGen, reaching just 41.0%. Further case studies show that in property-targeted generation (illustrated here by band-gap conditioning with MatterGen), the dynamical-stability rate reaches only 23.5% even at the best-performing band-gap condition of 0.5 eV. In space-group-controlled generation, higher-symmetry crystals exhibit better stability (e.g., cubic systems achieve rates up to 49.2%), yet the average stability across all controlled generations is still only 34.4%. An important additional outcome of this study is the identification of 28,119 crystal structures that are phonon-stable across the entire Brillouin zone, providing a substantial pool of reliable candidates for future materials exploration. By establishing the first large-scale dynamical-stability benchmark, this work systematically highlights the current limitations of crystal generation models and offers essential evaluation criteria and guidance for their future development toward the design and discovery of physically viable materials. All model-generated crystal structures, phonon calculation results, and the high-throughput evaluation workflows developed in PhononBench will be openly released at https://github.com/xqh19970407/PhononBench
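The stability criterion behind these numbers can be illustrated with a minimal sketch. It is not the PhononBench workflow: it assumes phonon frequencies have already been computed on a q-point mesh, that imaginary modes are encoded as negative frequencies (a common convention in phonon codes), and that a small negative tolerance absorbs numerical noise. The tolerance value, the function names, and the toy data are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical tolerance in THz; the threshold used by PhononBench is not
# stated in the abstract, so this value is an assumption for illustration.
TOLERANCE_THZ = -0.05

Frequencies = Sequence[Sequence[float]]  # indexed as [q-point][branch]


def is_dynamically_stable(freqs: Frequencies, tol: float = TOLERANCE_THZ) -> bool:
    """Return True if no mode at any sampled q-point falls below the tolerance.

    Imaginary modes are assumed to be stored as negative frequencies.
    """
    return all(f >= tol for q_point in freqs for f in q_point)


def stability_rate(structures: Dict[str, Frequencies]) -> float:
    """Fraction of structures that pass the stability check."""
    flags = [is_dynamically_stable(f) for f in structures.values()]
    return sum(flags) / len(flags)


# Toy data: two q-points with three branches each (values are made up).
toy = {
    "stable_example": [[0.0, 1.2, 3.4], [0.8, 1.5, 3.9]],
    "unstable_example": [[0.0, 1.1, 2.7], [-1.3, 0.9, 2.5]],  # one imaginary mode
}
toy_rate = stability_rate(toy)

# The headline rate follows from the counts reported in the abstract.
reported_rate_percent = round(28119 / 108843 * 100, 2)
```

On the toy data, one of the two structures passes, and the counts from the abstract reproduce the reported 25.83% average stability rate.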