Applying Vision Transformers on Spectral Analysis of Astronomical Objects

📅 2025-05-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge in astronomical spectral analysis of simultaneously capturing local fine-scale structures (e.g., absorption/emission lines) and global continuum features. We propose adapting image-pretrained Vision Transformers (ViTs) to one-dimensional spectroscopic modeling by reshaping spectral sequences into 2D pseudo-images, enabling ViT’s spatial self-attention mechanism to jointly model both local line profiles and global continuum distributions. Crucially, we directly fine-tune ImageNet-pretrained ViTs on large-scale real astronomical spectra from SDSS and LAMOST—without synthetic data or domain-specific pretraining. This constitutes the first end-to-end application of ViTs to real astrophysical spectra. Our approach significantly improves cross-object-type generalization: stellar classification accuracy surpasses conventional SVM and Random Forest baselines; redshift estimation achieves R² comparable to the state-of-the-art spectral encoder AstroCLIP, while demonstrating superior robustness and scalability.
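The core preprocessing idea above — turning a 1D spectrum into a 2D pseudo-image a ViT can consume — can be sketched as follows. This is a minimal illustration assuming a simple resample-and-reshape of the flux array into an H×W grid; the paper's actual conversion (the abstract mentions rendered spectral plots) may differ. The function name `spectrum_to_image` and the synthetic spectrum are illustrative, not from the paper.

```python
import numpy as np

def spectrum_to_image(flux, height=224, width=224):
    """Reshape a 1D spectrum into a 2D (height, width) pseudo-image.

    Assumption: the spectrum is resampled to height*width points and
    filled row by row; the paper may instead render spectra as plots.
    """
    n = height * width
    # Resample the flux to exactly n points via linear interpolation.
    x_old = np.linspace(0.0, 1.0, num=len(flux))
    x_new = np.linspace(0.0, 1.0, num=n)
    resampled = np.interp(x_new, x_old, flux)
    # Min-max normalize to [0, 1] so values behave like image intensities.
    lo, hi = resampled.min(), resampled.max()
    if hi > lo:
        resampled = (resampled - lo) / (hi - lo)
    return resampled.reshape(height, width)

# Example: a synthetic continuum with a Gaussian absorption line near H-alpha.
wave = np.linspace(4000, 9000, 3000)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 6563) / 5.0) ** 2)
img = spectrum_to_image(flux)
print(img.shape)  # (224, 224)
```

A 224×224 grid matches the input resolution of standard ImageNet-pretrained ViTs, so the resulting pseudo-image can be fed to the model with minimal changes.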

📝 Abstract
We apply pre-trained Vision Transformers (ViTs), originally developed for image recognition, to the analysis of astronomical spectral data. By converting traditional one-dimensional spectra into two-dimensional image representations, we enable ViTs to capture both local and global spectral features through spatial self-attention. We fine-tune a ViT pretrained on ImageNet using millions of spectra from the SDSS and LAMOST surveys, represented as spectral plots. Our model is evaluated on key tasks including stellar object classification and redshift ($z$) estimation, where it demonstrates strong performance and scalability. We achieve classification accuracy higher than Support Vector Machines and Random Forests, and attain $R^2$ values comparable to AstroCLIP's spectrum encoder, even when generalizing across diverse object types. These results demonstrate the effectiveness of using pretrained vision models for spectroscopic data analysis. To our knowledge, this is the first application of ViTs to large-scale, real spectroscopic data that does not rely on synthetic inputs.
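The abstract reports $R^2$ for redshift estimation; for reference, the coefficient of determination used in such regression evaluations can be computed as below. This is the standard definition, not code from the paper, and the toy redshift values are illustrative.

```python
import numpy as np

def r2_score(z_true, z_pred):
    """Coefficient of determination R^2: 1 - SS_res / SS_tot."""
    z_true = np.asarray(z_true, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    ss_res = np.sum((z_true - z_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((z_true - z_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Perfect predictions give R^2 = 1; imperfect ones give less.
print(r2_score([0.1, 0.5, 1.2], [0.1, 0.5, 1.2]))  # 1.0
print(r2_score([0.0, 1.0, 2.0], [0.0, 1.0, 1.0]))  # 0.5
```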
Problem

Research questions and friction points this paper is trying to address.

Adapting Vision Transformers for astronomical spectral analysis
Enhancing stellar classification and redshift estimation accuracy
Leveraging real spectroscopic data without synthetic inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Apply Vision Transformers to spectral data
Convert 1D spectra to 2D image representations
Fine-tune ViT with SDSS and LAMOST spectra
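Once a spectrum has been converted to a 2D image representation, a ViT consumes it as a sequence of flattened patches. A minimal numpy sketch of the standard ViT patch tokenization (16×16 patches, before the learned linear projection and positional embeddings) — an illustration of the mechanism, not the paper's exact pipeline:

```python
import numpy as np

def to_patches(img, patch=16):
    """Split an (H, W) pseudo-image into ViT-style flattened patch tokens."""
    h, w = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    rows, cols = h // patch, w // patch
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch) -> (tokens, dim)
    patches = (img.reshape(rows, patch, cols, patch)
                  .transpose(0, 2, 1, 3)
                  .reshape(rows * cols, patch * patch))
    return patches

img = np.arange(224 * 224, dtype=float).reshape(224, 224)
tokens = to_patches(img)
print(tokens.shape)  # (196, 256): 14x14 patches, each 16x16 pixels flattened
```

Each token can attend to every other token, which is how spatial self-attention jointly models narrow line profiles (local patches) and the overall continuum shape (long-range patch interactions).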
Luis Felipe Strano Moraes
Harvard Extension School, Harvard University, Cambridge, MA, 02138, USA
Ignacio Becker
John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA, 02138, USA
P. Protopapas
John A. Paulson School of Engineering and Applied Science, Harvard University, Cambridge, MA, 02138, USA
Guillermo Cabrera-Vives
Department of Computer Science, University of Concepción
Artificial Intelligence · Deep Learning · Astroinformatics · Bioinformatics