🤖 AI Summary
This study addresses the limited generalizability of the meteorological foundation model Aurora to hydrological variables unseen during pretraining. To overcome this, we propose an efficient adaptation strategy that freezes Aurora’s backbone and appends a lightweight feed-forward decoder. Unlike full-model fine-tuning, our method trains only the shallow decoder, leveraging latent representations extracted by the frozen backbone for cross-variable transfer—demonstrating, for the first time, that Aurora’s latent space encodes physically consistent, transferable information. Experiments across multiple novel hydrological variables show that our approach achieves prediction accuracy comparable to full fine-tuning while reducing training time by 50% and GPU memory consumption by 35%, all without compromising autoregressive stability. This work establishes a new paradigm for the resource-efficient extension of foundation models in computationally constrained settings.
📝 Abstract
Recent advances in AI weather forecasting have led to the emergence of so-called "foundation models", typically defined by expensive pretraining and minimal fine-tuning for downstream tasks. However, in the natural sciences, a desirable foundation model should also encode meaningful statistical relationships between the underlying physical variables. This study evaluates the performance of the state-of-the-art Aurora foundation model in predicting hydrological variables, which were not considered during pretraining. We introduce a lightweight approach using shallow decoders trained on the latent representations of the pretrained model to predict these new variables. As a baseline, we compare this to fine-tuning the full model, which allows further optimization of the latent space while incorporating new variables into both inputs and outputs. The decoder-based approach requires 50% less training time and 35% less memory, while achieving strong accuracy across various hydrological variables and preserving desirable properties of the foundation model, such as autoregressive stability. Notably, decoder accuracy depends on the physical correlation between the new variables and those used during pretraining, indicating that Aurora's latent space captures meaningful physical relationships. In this sense, we argue that an important quality metric for foundation models in Earth sciences is their ability to be extended to new variables without full fine-tuning. This provides a new perspective for making foundation models more accessible to communities with limited computational resources, while supporting broader adoption in Earth sciences.
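The adaptation recipe described above — freeze the pretrained backbone, then train only a shallow feed-forward decoder on its latent representations — can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions: the small `backbone` module, the layer sizes, and the synthetic inputs and targets are hypothetical stand-ins, not Aurora's actual architecture or data pipeline.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for Aurora's pretrained backbone: any module that
# maps input fields to latent representations (shapes are illustrative).
backbone = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 32))

# Freeze every backbone parameter so only the decoder is optimized.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# Lightweight feed-forward decoder for a new (e.g. hydrological) variable.
decoder = nn.Sequential(nn.Linear(32, 64), nn.GELU(), nn.Linear(64, 1))

optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Synthetic data standing in for atmospheric inputs and the new target.
torch.manual_seed(0)
x = torch.randn(256, 16)
y = torch.randn(256, 1)

losses = []
for _ in range(100):
    with torch.no_grad():          # backbone features are fixed
        z = backbone(x)
    pred = decoder(z)              # only the decoder sees gradients
    loss = loss_fn(pred, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"decoder loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Because gradients never flow into the backbone, no optimizer state or activation memory is kept for it, which is the source of the training-time and memory savings the paper reports relative to full fine-tuning.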