Data-Driven Spectral Prediction for Accelerating Large-Scale Electronic Structure Calculations

📅 2026-05-29

🤖 AI Summary
This work addresses the high computational cost of generalized eigenvalue problems in large-scale density functional theory calculations on exascale architectures by introducing a data-driven framework that reformulates spectral prediction as a regression task for coefficients of Chebyshev interpolating polynomials. Combining all-atom and fragmented structural representations, the approach leverages kernel ridge regression, graph neural networks, and random forest models trained on a 2 TB dataset of protein dimers. Integrated into the BigDFT software package, it provides high-quality initial guesses that significantly accelerate the early stages of self-consistent field iterations. This innovation overcomes the dimensional limitations of conventional methods and establishes a foundation for dynamic optimization of rational filter eigensolvers such as FrASE.
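The idea of replacing a long list of eigenvalues with a handful of Chebyshev coefficients can be illustrated with a minimal sketch. It assumes the sorted spectrum is treated as a function of a normalized index t in [-1, 1] and sampled at Chebyshev nodes by linear interpolation between neighbouring eigenvalues; that sampling choice and the function names `chebyshev_coefficients`, `evaluate_chebyshev`, and `reconstruct_spectrum` are illustrative assumptions, not the construction used in the paper or in BigDFT.

```python
import math


def chebyshev_coefficients(eigenvalues, degree):
    """Compress a spectrum into degree + 1 Chebyshev coefficients.

    Assumption: the sorted spectrum is viewed as a function of a normalized
    index t in [-1, 1] and sampled at Chebyshev nodes by linear interpolation.
    Requires at least two eigenvalues.
    """
    lam = sorted(eigenvalues)
    n = len(lam)
    m = degree + 1  # number of nodes equals number of coefficients

    def spectrum_at(t):
        # Map t in [-1, 1] to a fractional index in [0, n - 1].
        pos = (t + 1.0) * 0.5 * (n - 1)
        lo = min(int(math.floor(pos)), n - 2)
        frac = pos - lo
        return (1.0 - frac) * lam[lo] + frac * lam[lo + 1]

    # Chebyshev nodes of the first kind and the spectrum sampled there.
    angles = [math.pi * (j + 0.5) / m for j in range(m)]
    values = [spectrum_at(math.cos(a)) for a in angles]

    # Discrete cosine sums give the interpolation coefficients.
    coeffs = []
    for k in range(m):
        s = sum(v * math.cos(k * a) for v, a in zip(values, angles))
        coeffs.append((1.0 if k == 0 else 2.0) * s / m)
    return coeffs


def evaluate_chebyshev(coeffs, t):
    """Evaluate sum_k coeffs[k] * T_k(t) with the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + c, b1
    return t * b1 - b2 + coeffs[0]


def reconstruct_spectrum(coeffs, n):
    """Recover n approximate eigenvalues from the coefficients."""
    return [evaluate_chebyshev(coeffs, -1.0 + 2.0 * i / (n - 1)) for i in range(n)]


# Toy spectrum: 200 eigenvalues reduced to 9 regression targets.
toy_spectrum = [(i / 199.0) ** 2 for i in range(200)]
toy_coeffs = chebyshev_coefficients(toy_spectrum, degree=8)
toy_recovered = reconstruct_spectrum(toy_coeffs, len(toy_spectrum))
```

In this sketch the number of learning targets is fixed by the polynomial degree rather than by the number of eigenvalues, which is the property the summary refers to when it mentions overcoming dimensional limitations.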
📝 Abstract
Simulating large molecular systems comprising thousands of atoms requires highly scalable methodologies. While modern Density Functional Theory (DFT) codes exhibit linear scaling, solving the associated large, sparse generalized eigenproblems remains a critical computational bottleneck on exascale architectures. In the context of the LimitX project, we propose a data-driven framework to accelerate these calculations. By shifting the machine learning target from discrete eigenvalues to the coefficients of an interpolating Chebyshev polynomial, and by comparing both all-atom and fragment-based structural representations, we successfully overcome the dimensionality constraints of large-scale spectral prediction. We investigate three machine learning models (Kernel Ridge Regression, Graph Neural Networks, and Random Forests) trained on a novel 2 TB dataset of protein dimers. The predicted spectra provide initial guesses that effectively bypass early Self-Consistent Field (SCF) iterations in BigDFT. Ultimately, these spectral predictors will be deployed to dynamically optimize upcoming rational filter-based eigensolvers, such as FrASE, which is currently in initial development.
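Of the three models named in the abstract, Kernel Ridge Regression is the simplest to sketch. The block below is a generic, closed-form multi-output KRR with a Gaussian (RBF) kernel written with NumPy; each row of `Y` stands for one structure's vector of polynomial coefficients. The descriptors, the kernel choice, and the hyperparameters `gamma` and `alpha` are illustrative assumptions, and the function names are hypothetical; none of them are taken from the paper.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def fit_krr(X, Y, gamma=1.0, alpha=1e-6):
    """Solve (K + alpha * I) D = Y for the dual coefficients D.

    X: (n_samples, n_features) structural descriptors (assumed given).
    Y: (n_samples, n_coeffs) target coefficient vectors, one row per structure.
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + alpha * np.eye(len(X)), Y)


def predict_krr(X_train, dual, X_new, gamma=1.0):
    """Predict one coefficient vector per row of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ dual


# Synthetic stand-in data: 2 descriptor features, 3 target coefficients.
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 2))
Y_demo = np.stack(
    [np.sin(X_demo[:, 0]), np.cos(X_demo[:, 1]), X_demo[:, 0] * X_demo[:, 1]],
    axis=1,
)
dual_demo = fit_krr(X_demo, Y_demo)
pred_demo = predict_krr(X_demo, dual_demo, X_demo[:5])
```

Because all target columns share the same kernel matrix, a single linear solve fits every coefficient at once, so adding more polynomial coefficients adds columns to `Y` without changing the size of the system being solved.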
Problem

Research questions and friction points this paper is trying to address.

electronic structure calculations
generalized eigenproblems
spectral prediction
Density Functional Theory
exascale computing
Innovation

Methods, ideas, or system contributions that make the work stand out.

data-driven spectral prediction
Chebyshev polynomial interpolation
fragment-based representation
machine learning for DFT
eigensolver acceleration
Authors
Abhiram Badrinarayanan
Ruđer Bošković Institute, Croatia
Davor Davidović
Ruđer Bošković Institute
Numerical linear algebra, GPU computing, hybrid computing, parallel programming
Edoardo Di Napoli
Lead Scientist, Juelich Supercomputing Centre
Computational physics, numerical linear algebra, scientific computing, high-performance computing, theoretical physics
Jurica Novak
Ruđer Bošković Institute, Croatia
Luigi Genovese
CEA / IRIG / MEM / L_Sim, F-38000 Grenoble, France
Gustavo Ramirez-Hidalgo
Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
Xinzhe Wu
Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany