Spectra as Language: Large Language Models for Scalable Stellar Parameter and Abundance Inference

📅 2026-05-21

🤖 AI Summary
Traditional approaches to analyzing large-scale stellar spectra struggle with high-dimensional data, limited generalization, and low computational efficiency. This work applies large language models (LLMs) to stellar spectral analysis, proposing a two-stage, scalable sequence-modeling framework that treats spectra as continuous sequential signals and accurately estimates stellar effective temperature, surface gravity, metallicity, and the abundances of approximately 20 chemical elements. Scaling-law analysis shows that performance improves systematically with increasing data volume, which makes the framework a scalable option for forthcoming large-scale spectroscopic surveys.
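To make the idea of treating a spectrum as a sequence concrete, here is a minimal sketch of one common way to turn a 1-D flux array into a sequence of fixed-length patches that a sequence model could consume. This is an illustration under stated assumptions, not the paper's method: the helper `spectrum_to_patches`, the patch length of 16, the zero-padding, and the synthetic spectrum are all hypothetical choices.

```python
import numpy as np

def spectrum_to_patches(flux, patch_len=16):
    """Split a 1-D flux array into non-overlapping, fixed-length patches.

    Hypothetical helper: the patch length and right-side zero-padding are
    illustrative choices, not taken from the paper.
    """
    flux = np.asarray(flux, dtype=float)
    pad = (-len(flux)) % patch_len          # zeros needed to fill the last patch
    padded = np.pad(flux, (0, pad))         # pad on the right only
    return padded.reshape(-1, patch_len)    # shape: (n_patches, patch_len)

# Synthetic, made-up spectrum: flat continuum with one Gaussian absorption line.
wave = np.linspace(4000.0, 5000.0, 1000)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 4861.0) / 2.0) ** 2)

patches = spectrum_to_patches(flux, patch_len=16)
print(patches.shape)
```

Each row of `patches` plays the role of one "token"; in practice a learned projection would likely map each patch to an embedding before it reaches the model.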
📝 Abstract
Stellar spectra encode key information on the physical properties and chemical compositions of stars. Accurate stellar parameter determination is essential for addressing major questions such as galaxy and stellar evolution. Large-scale spectroscopic surveys have accumulated unprecedented spectral data. Traditional feature extraction or model-fitting approaches struggle with high-dimensional, massive datasets, limited generalization, and computational inefficiency. Recent advances in large language models demonstrate strong generalization and feature-learning in tasks like natural language processing, DNA/RNA sequence analysis, and protein/chemical parsing. Stellar spectra are continuous sequential signals, enabling the transfer of language models to stellar spectroscopy. Here, we propose a two-stage large language model framework for stellar parameter inference, achieving accurate estimation of effective temperature, surface gravity, metallicity, and abundances of ~20 chemical elements. Scaling-law analyses show systematic performance improvements with increasing data, providing a scalable framework for forthcoming large-scale surveys.
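A scaling-law analysis of the kind mentioned above is often summarized by fitting a power law, error ≈ a · N^(−b), to prediction error as a function of training-set size N. The sketch below shows that fit as a straight-line regression in log-log space. The functional form, the helper name `fit_power_law`, and the numbers are assumptions for illustration only; they are not results or procedures from the paper.

```python
import numpy as np

def fit_power_law(n_samples, errors):
    """Fit error ≈ a * N**(-b) by least squares in log-log space.

    Hypothetical helper; returns the prefactor a and the exponent b.
    """
    log_n = np.log(np.asarray(n_samples, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return np.exp(intercept), -slope

# Made-up numbers for illustration only; NOT results from the paper.
n_train = np.array([1e3, 1e4, 1e5, 1e6])
err = 50.0 * n_train ** -0.25

a, b = fit_power_law(n_train, err)
print(f"a = {a:.2f}, b = {b:.3f}")
```

A positive fitted exponent `b` corresponds to error falling as the dataset grows, which is the qualitative behavior the abstract describes.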
Problem

Research questions and friction points this paper is trying to address.

stellar spectra
stellar parameter inference
chemical abundances
large-scale spectroscopic surveys
high-dimensional data
Innovation

Methods, ideas, or system contributions that make the work stand out.

large language models
stellar spectroscopy
stellar parameter inference
scaling laws
spectral analysis
Hai-Ling Lu
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Yu-Yang Li
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing 100049, China
Yin-Bi Li
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Cun-Shi Wang
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing 100049, China
A-Li Luo
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing 100049, China; University of Chinese Academy of Sciences, Nanjing 211135, China
Jun-Chao Liang
National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Shuo Li
University of Chinese Academy of Sciences; Institute of Software, Chinese Academy of Sciences
Software Evolution and Maintenance; Software Testing