PhononBench: A Large-Scale Phonon-Based Benchmark for Dynamical Stability in Crystal Generation

📅 2025-12-24
🤖 AI Summary
Current AI-generated crystal structures show poor dynamical stability: the average stability rate is only 25.83%, and even the best-performing model, MatterGen, reaches just 41.0%. Constrained generation fares no better than MatterGen's unconditioned result: band-gap conditioning reaches at most 23.5%, and space-group-controlled generation averages 34.4%. Method: We introduce PhononBench, the first large-scale benchmark for phonon-stability evaluation, comprising 108,843 structures generated by six leading crystal generation models. Using the MatterSim interatomic potential, which reaches DFT-level accuracy in phonon predictions, together with a high-throughput evaluation workflow, we compute phonons and assess dynamical stability for every structure. Contribution/Results: The benchmark documents a widespread limitation of current generative models in ensuring dynamical stability. We identify 28,119 structures that are phonon-stable across the entire Brillouin zone; these, along with all generated structures and the evaluation workflow, will be openly released. PhononBench provides evaluation criteria and guidance for developing models that generate physically viable crystals.

📝 Abstract
In this work, we introduce PhononBench, the first large-scale benchmark for dynamical stability in AI-generated crystals. Leveraging the recently developed MatterSim interatomic potential, which achieves DFT-level accuracy in phonon predictions across more than 10,000 materials, PhononBench enables efficient large-scale phonon calculations and dynamical-stability analysis for 108,843 crystal structures generated by six leading crystal generation models. PhononBench reveals a widespread limitation of current generative models in ensuring dynamical stability: the average dynamical-stability rate across all generated structures is only 25.83%, with the top-performing model, MatterGen, reaching just 41.0%. Further case studies show that in property-targeted generation (illustrated here by band-gap conditioning with MatterGen), the dynamical-stability rate remains as low as 23.5% even at the optimal band-gap condition of 0.5 eV. In space-group-controlled generation, higher-symmetry crystals exhibit better stability (e.g., cubic systems achieve rates up to 49.2%), yet the average stability across all controlled generations is still only 34.4%. An important additional outcome of this study is the identification of 28,119 crystal structures that are phonon-stable across the entire Brillouin zone, providing a substantial pool of reliable candidates for future materials exploration. By establishing the first large-scale dynamical-stability benchmark, this work systematically highlights the current limitations of crystal generation models and offers essential evaluation criteria and guidance for their future development toward the design and discovery of physically viable materials. All model-generated crystal structures, phonon calculation results, and the high-throughput evaluation workflows developed in PhononBench will be openly released at https://github.com/xqh19970407/PhononBench
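
To make the stability-rate figures concrete, here is a minimal Python sketch of how such a rate could be computed from phonon frequencies. It is not the PhononBench workflow. The function names, the toy data, and the tolerance `IMAG_TOL_THZ` are all assumptions made for illustration, and the abstract does not state which threshold the authors use. The sketch follows the common convention of storing imaginary modes as negative frequencies and treats a structure as dynamically stable when no mode on the sampled Brillouin-zone grid falls below the tolerance.

```python
from typing import Dict, List

# Assumed tolerance in THz. Slightly negative acoustic modes near Gamma are
# often numerical noise; the threshold used by PhononBench is not given in
# the abstract, so this value is purely illustrative.
IMAG_TOL_THZ = -0.1


def is_dynamically_stable(frequencies_thz: List[float],
                          tol: float = IMAG_TOL_THZ) -> bool:
    """Return True if no phonon mode lies below the tolerance.

    `frequencies_thz` is a flat list of mode frequencies collected over all
    sampled q-points; imaginary modes are encoded as negative numbers.
    """
    return min(frequencies_thz) >= tol


def stability_rate(samples: Dict[str, List[float]]) -> float:
    """Percentage of structures classified as dynamically stable."""
    stable = sum(is_dynamically_stable(f) for f in samples.values())
    return 100.0 * stable / len(samples)


# Hypothetical toy data: structure id -> phonon frequencies (THz).
toy = {
    "struct_a": [0.0, 0.0, 0.0, 3.2, 5.1],     # stable
    "struct_b": [-2.4, 0.0, 0.0, 1.9, 4.4],    # clear imaginary mode
    "struct_c": [-0.02, 0.0, 0.01, 2.7, 6.0],  # noise within tolerance
    "struct_d": [-0.8, -0.3, 0.0, 2.2, 3.9],   # unstable
}
toy_rate = stability_rate(toy)

# Consistency check on the numbers reported in the abstract:
# 28,119 stable structures out of 108,843 generated structures.
reported_rate = 100.0 * 28119 / 108843

print(f"toy stability rate: {toy_rate:.1f}%")
print(f"rate implied by the abstract: {reported_rate:.2f}%")
```

The final check shows that the 28,119 phonon-stable structures and the 108,843 evaluated structures are consistent with the reported 25.83% average. In practice the frequency lists would come from a lattice-dynamics calculation on a q-point mesh rather than from hand-written values.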
Problem

Research questions and friction points this paper is trying to address.

Evaluates dynamical stability of AI-generated crystals using large-scale phonon calculations
Reveals low stability rates in current generative models, averaging only 25.83%
Identifies 28,119 phonon-stable structures for future materials discovery and model improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale phonon benchmark for AI-generated crystals
Uses MatterSim potential for DFT-level phonon predictions
Evaluates 108,843 structures from six generative models
Xiao-Qi Han
Ph.D. student, Renmin University of China
Artificial Intelligence, Materials Discovery, Drug Discovery, Theoretical Physics
Peng-Jie Guo
School of Physics, Renmin University of China, Beijing, China
Ze-Feng Gao
School of Physics, Renmin University of China, Beijing, China
Zhong-Yi Lu
School of Physics, Renmin University of China, Beijing, China