Latent Diffusion Pretraining for Crystal Property Prediction

📅 2026-05-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses crystal property prediction, which is hindered by scarce labeled data and limited model generalization. The authors propose CrysLDNet, a framework that introduces latent-space diffusion pretraining to this domain for the first time. A variational autoencoder and a latent diffusion model are jointly pretrained on large-scale unlabeled crystal structures so that the graph encoder learns structural and chemical semantics. The pretrained encoder is then fine-tuned for downstream property prediction tasks. CrysLDNet outperforms existing baselines by 4.26% and 4.90% on the JARVIS and Materials Project datasets, respectively. It also remains robust in low-data regimes and can correct errors in DFT-computed properties when fine-tuned on a small amount of experimental data.

📝 Abstract

Fast and accurate prediction of crystal properties is a central challenge in new materials design. Graph neural networks and Transformer-based models have emerged as powerful tools for this task because they can encode the local structural environment of atoms within a crystal. However, these models are data-hungry, and in practice labeled data for crystal properties are scarce. Pretraining-finetuning strategies, particularly those based on diffusion models, have shown promise in addressing these limitations. In this work, we introduce a novel latent-diffusion-based pretraining framework, CrysLDNet, designed to mitigate data scarcity. Our approach integrates a Variational Autoencoder (VAE) with a diffusion model during the pretraining stage. The VAE encoder maps 3D crystal structures into a smooth latent space, and the diffusion process is applied within that latent space. This latent diffusion pretraining enables the graph encoder to capture structural and chemical semantics from large-scale unlabeled data, and the encoder can then be finetuned for specific property prediction tasks. Comprehensive experiments on popular DFT datasets for property prediction show that CrysLDNet significantly outperforms both training-from-scratch and pretrained baselines, with improvements of 4.26% and 4.90% on the JARVIS and MP datasets, respectively. Additionally, the learned representations remain robust in sparse-data conditions and are expressive enough to correct DFT errors when finetuned with limited experimental data. Code is available at: https://github.com/shrimonmuke0202/CrysLDNet.git.
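To make the idea of "diffusion applied in the VAE latent space" concrete, here is a minimal NumPy sketch of one pretraining step. It is not the authors' implementation: it assumes a standard DDPM-style forward process with a linear noise schedule and a noise-prediction (MSE) objective, none of which the abstract specifies. The function names, latent size, and schedule values are hypothetical, and random arrays stand in for the outputs of a graph encoder.

```python
import numpy as np

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative signal fraction alpha_bar_t for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def reparameterize(mu, log_var, rng):
    """VAE sampling step: z0 = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def add_latent_noise(z0, t, alpha_bar, rng):
    """Forward diffusion on latents: z_t = sqrt(a_t) * z0 + sqrt(1 - a_t) * noise."""
    noise = rng.standard_normal(z0.shape)
    a = alpha_bar[t][:, None]  # one alpha_bar value per sample in the batch
    z_t = np.sqrt(a) * z0 + np.sqrt(1.0 - a) * noise
    return z_t, noise

def denoising_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    return float(np.mean((predicted_noise - true_noise) ** 2))

rng = np.random.default_rng(0)
batch, latent_dim, num_steps = 4, 8, 1000

# Stand-ins for the encoder outputs of a batch of crystal graphs (hypothetical).
mu = rng.standard_normal((batch, latent_dim))
log_var = np.full((batch, latent_dim), -2.0)

alpha_bar = linear_alpha_bar(num_steps)
z0 = reparameterize(mu, log_var, rng)
t = rng.integers(0, num_steps, size=batch)  # random timestep per sample
z_t, noise = add_latent_noise(z0, t, alpha_bar, rng)

# A trained denoiser would predict `noise` from (z_t, t); zeros are a placeholder.
loss = denoising_loss(np.zeros_like(noise), noise)
```

In a full pipeline the placeholder prediction would come from a learned denoising network, and the gradients of this loss (together with the VAE terms) would update the encoder that is later finetuned for property prediction.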

Problem

Research questions and friction points this paper is trying to address.

crystal property prediction

data scarcity

pretraining

diffusion models

materials design

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent diffusion

pretraining

crystal property prediction