GAIA: A Foundation Model for Operational Atmospheric Dynamics

📅 2025-05-15
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of learning semantically rich representations of atmospheric dynamics from geostationary satellite imagery, decoupled from diurnal texture variations, in order to improve downstream atmospheric modeling. We propose a novel geospatial foundation model architecture that combines masked autoencoding (MAE) with label-free self-distillation (DINO), jointly capturing local spatiotemporal details and global dynamical dependencies. The model reconstructs imagery robustly under high masking ratios and estimates precipitation accurately from limited labeled data, reaching a false alarm ratio of 0.088 and a structural similarity index of 0.881. The core contribution is the adaptation of self-supervised learning paradigms to the modeling of physical atmospheric processes, yielding a scalable framework for global meteorological representation learning that depends little on labels.
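
The "high masking ratios" mentioned above refer to the MAE practice of hiding a random subset of image patches and asking the model to reconstruct them. The following is a minimal sketch of that sampling step only, not the GAIA implementation: the function name `random_patch_mask`, the patch count of 196, and the ratio of 0.75 are illustrative assumptions that do not come from the paper.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) lists, MAE-style.

    Hypothetical helper for illustration; the values used below are
    assumptions, not settings reported for GAIA.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # uniform random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])    # hidden from the encoder
    visible = sorted(indices[num_masked:])   # the only patches the encoder sees
    return visible, masked


# Example: a 14 x 14 patch grid with three quarters of the patches hidden.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
```

Only the visible patches are encoded; the reconstruction loss is then evaluated on the masked ones, which is why performance is usually reported across several mask ratios.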

📝 Abstract
We present the GAIA (Geospatial Artificial Intelligence for Atmospheres) Foundation Model, a novel model that combines masked autoencoders (MAE) and self-DIstillation with NO labels (DINO) for analyzing global atmospheric patterns in satellite imagery. By integrating these complementary self-supervised learning approaches, our model captures both local features and global dependencies. As our first downstream tasks, we address two critical challenges in satellite data analysis: reconstructing missing regions and estimating precipitation patterns. The model captures temporal patterns better than standard MAE approaches while maintaining robust performance in downstream tasks. Our experimental results show strong gap-filling capabilities across varying mask ratios and accurate precipitation estimation with limited training data, achieving a false alarm ratio of 0.088 and structural similarity of 0.881. This work represents an advancement in self-supervised learning for atmospheric science, providing a foundation for improved weather monitoring and climate analysis. The trained model weights and accompanying code are publicly available on Hugging Face: https://huggingface.co/bcg-usra-nasa-gaia/GAIA-v1.
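
For readers unfamiliar with the false alarm ratio quoted in the abstract, the sketch below uses the standard forecast-verification definition: false alarms divided by all predicted events. The abstract does not state the rain/no-rain threshold or the evaluation protocol, so the function name `false_alarm_ratio`, the threshold of 1.0, and the sample values are illustrative assumptions.

```python
def false_alarm_ratio(predicted, observed, threshold):
    """FAR = false alarms / (hits + false alarms).

    A pixel counts as an event when its value is >= threshold.
    The threshold is an assumption; the paper's choice is not given here.
    """
    hits = 0
    false_alarms = 0
    for p, o in zip(predicted, observed):
        if p >= threshold:          # the model predicted an event
            if o >= threshold:
                hits += 1           # the event was also observed
            else:
                false_alarms += 1   # predicted but not observed
    predicted_events = hits + false_alarms
    if predicted_events == 0:
        return float("nan")         # undefined when nothing was predicted
    return false_alarms / predicted_events


# Made-up precipitation values for five pixels (not data from the paper).
predicted = [0.0, 1.2, 3.4, 0.6, 2.0]
observed = [0.0, 1.5, 0.2, 0.7, 2.5]
far = false_alarm_ratio(predicted, observed, threshold=1.0)
```

In this toy case three pixels are predicted as events and one of them is not observed, so the ratio is one third. Lower is better, with 0 meaning no predicted event was spurious.
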
Problem

Research questions and friction points this paper is trying to address.

Developing a hybrid self-supervised model for atmospheric representation learning
Capturing atmospheric dynamics beyond trivial diurnal satellite patterns
Enhancing downstream tasks like cyclone detection and river segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid MAE-DINO model for satellite imagery
Learns disentangled atmospheric dynamics representations
Achieves robust reconstruction with superior gap-filling
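
One way to picture the hybrid MAE-DINO objective is as a reconstruction term plus a self-distillation term. The sketch below is written under that assumption and is not the authors' training code: the weighting scheme (`distill_weight`), the temperatures, and all function names are hypothetical, and how GAIA actually combines the two losses is not stated in this summary.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def masked_mse(pred_patches, target_patches, masked):
    """MAE-style loss: mean squared error over masked patches only."""
    errors = [
        (p - t) ** 2
        for i in masked
        for p, t in zip(pred_patches[i], target_patches[i])
    ]
    return sum(errors) / len(errors)


def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """DINO-style loss: cross-entropy from the centered, sharpened teacher
    distribution to the student distribution. Temperatures are illustrative."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(t * s for t, s in zip(teacher_probs, student_log_probs))


def hybrid_loss(recon, distill, distill_weight=1.0):
    """Assumed combination: a weighted sum of the two terms."""
    return recon + distill_weight * distill


# Toy inputs: two patches of two pixels each, with only patch 1 masked.
recon = masked_mse([[0.0, 0.0], [1.0, 3.0]], [[9.0, 9.0], [1.0, 1.0]], masked=[1])
distill = dino_loss([0.2, 0.1, -0.3], [0.3, 0.0, -0.2], center=[0.0, 0.0, 0.0])
total = hybrid_loss(recon, distill, distill_weight=0.5)
```

The reconstruction term pushes the encoder toward local detail, while the distillation term rewards image-level representations that agree between student and teacher, which matches the local-plus-global motivation given for the hybrid design.
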
A. A. Asanjan
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)
Olivia Alexander
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)
Tom Berg
BCG X AI Science Institute
Clara Zhang
BCG X AI Science Institute
Matt Yang
BCG X AI Science Institute
Jad Makki
BCG X AI Science Institute
Disha Shidham
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)
Srija Chakraborty
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)
William Bender
BCG X AI Science Institute
Stephen Peng
BCG X AI Science Institute
Arun Ravindran
University of North Carolina at Charlotte
Computing systems, Edge computing, AI
Olivier Raiman
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)
David T. Potere
BCG X AI Science Institute
David Bell
Research Institute for Advanced Computer Science (RIACS) at Universities Space Research Association (USRA)