Re-envisioning Euclid Galaxy Morphology: Identifying and Interpreting Features with Sparse Autoencoders

Date: 2025-10-27
AI Summary
This study addresses how to efficiently identify and interpret human-interpretable morphology features in Euclid Q1 galaxy images, including features that fall outside the Galaxy Zoo decision tree and are embedded in pretrained neural networks. Method: Sparse autoencoders (SAEs) are applied to two pretrained models, the supervised Zoobot model and a new self-supervised masked autoencoder (MAE), to extract candidate monosemantic galaxy morphology features from Euclid Q1 data. Contribution/Results: Compared with Principal Component Analysis (PCA), SAE features show stronger alignment with Galaxy Zoo labels and include interpretable features outside the Galaxy Zoo framework. The publicly released MAE achieves superhuman image reconstruction performance. To our knowledge, this work is the first systematic effort to mine interpretable galaxy morphology features from pretrained vision models, and it points to a route for discovering astrophysical phenomena beyond human-defined classification.

Abstract
Sparse Autoencoders (SAEs) can efficiently identify candidate monosemantic features from pretrained neural networks for galaxy morphology. We demonstrate this on Euclid Q1 images using both supervised (Zoobot) and new self-supervised (MAE) models. Our publicly released MAE achieves superhuman image reconstruction performance. While a Principal Component Analysis (PCA) on the supervised model primarily identifies features already aligned with the Galaxy Zoo decision tree, SAEs can identify interpretable features outside of this framework. SAE features also show stronger alignment than PCA with Galaxy Zoo labels. Although challenges in interpretability remain, SAEs provide a powerful engine for discovering astrophysical phenomena beyond the confines of human-defined classification.
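
To make the PCA baseline mentioned in the abstract concrete, here is a minimal sketch of projecting model embeddings onto principal components and scoring them against a label. Everything in it is an assumption for illustration: the embeddings and labels are synthetic, the helper names `pca_project` and `best_label_alignment` are hypothetical, and absolute Pearson correlation is only one possible alignment measure, not necessarily the one the authors use.

```python
import numpy as np

def pca_project(embeddings, n_components):
    """Project embeddings onto their top principal components (via SVD)."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def best_label_alignment(features, labels):
    """Largest absolute Pearson correlation between any feature and the label.

    Assumed alignment measure for illustration only.
    """
    f = features - features.mean(axis=0)
    y = labels - labels.mean()
    denom = np.linalg.norm(f, axis=0) * np.linalg.norm(y)
    # Constant features get a correlation of 0 instead of a division by zero.
    corr = np.abs(f.T @ y) / np.where(denom > 0, denom, np.inf)
    return float(corr.max())

# Synthetic stand-in for model embeddings: one high-variance direction
# that also happens to be the (hypothetical) label.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 8))
embeddings[:, 0] *= 5.0
labels = embeddings[:, 0]

projected = pca_project(embeddings, n_components=3)
score = best_label_alignment(projected, labels)
```

The same `best_label_alignment` call could be applied to SAE feature activations in place of `projected`, which is the kind of side-by-side comparison the abstract describes.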
Problem

Research questions and friction points this paper is trying to address.

Identifying galaxy morphology features using sparse autoencoders
Comparing supervised and self-supervised models for feature extraction
Discovering astrophysical phenomena beyond human-defined classification systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse autoencoders identify candidate monosemantic features from pretrained neural networks
SAEs find interpretable features beyond human classification
SAEs align better with Galaxy Zoo labels than PCA
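
The sparse autoencoder idea behind the points above can be sketched in a few lines of NumPy. This is a generic illustration under stated assumptions, not the authors' implementation: the ReLU encoder, linear decoder, L1 sparsity penalty, layer sizes, learning rate, and the random stand-in for model embeddings are all hypothetical choices.

```python
import numpy as np

def init_sae(d_in, d_hidden, seed=0):
    """Initialise an overcomplete autoencoder (d_hidden > d_in is typical)."""
    rng = np.random.default_rng(seed)
    w_enc = rng.normal(0.0, 0.1, size=(d_in, d_hidden))
    return {
        "w_enc": w_enc,
        "b_enc": np.zeros(d_hidden),
        "w_dec": w_enc.T.copy(),  # decoder starts as the encoder transpose
        "b_dec": np.zeros(d_in),
    }

def sae_forward(params, x):
    """Return non-negative feature activations and the reconstruction."""
    z = np.maximum(0.0, x @ params["w_enc"] + params["b_enc"])
    x_hat = z @ params["w_dec"] + params["b_dec"]
    return z, x_hat

def sae_loss(params, x, l1_coeff):
    """Mean squared reconstruction error plus an L1 penalty on activations."""
    z, x_hat = sae_forward(params, x)
    recon = np.mean(np.sum((x_hat - x) ** 2, axis=1))
    sparsity = np.mean(np.sum(z, axis=1))  # z >= 0, so this is the L1 norm
    return recon + l1_coeff * sparsity

def sae_train_step(params, x, l1_coeff, lr):
    """One full-batch gradient descent step with hand-derived gradients."""
    n = x.shape[0]
    z, x_hat = sae_forward(params, x)
    d_x_hat = 2.0 * (x_hat - x) / n
    active = (z > 0).astype(x.dtype)  # ReLU passes gradient only where active
    d_pre = (d_x_hat @ params["w_dec"].T + l1_coeff / n) * active
    grads = {
        "w_enc": x.T @ d_pre,
        "b_enc": d_pre.sum(axis=0),
        "w_dec": z.T @ d_x_hat,
        "b_dec": d_x_hat.sum(axis=0),
    }
    return {name: params[name] - lr * grads[name] for name in params}

# Random vectors stand in for embeddings from a pretrained model.
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(256, 16))

params = init_sae(d_in=16, d_hidden=64)
initial_loss = sae_loss(params, embeddings, l1_coeff=1e-3)
for _ in range(200):
    params = sae_train_step(params, embeddings, l1_coeff=1e-3, lr=0.01)
final_loss = sae_loss(params, embeddings, l1_coeff=1e-3)

codes, _ = sae_forward(params, embeddings)
```

Each column of `codes` is one learned feature. In an interpretability workflow of the kind described here, one would inspect the images that most strongly activate a given column to judge whether it corresponds to a single recognisable morphology.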