🤖 AI Summary
Existing weather forecasting methods struggle to simultaneously achieve high accuracy, rigorous uncertainty quantification, and computational efficiency: Numerical Weather Prediction (NWP) is computationally prohibitive, while mainstream machine learning–based weather prediction (MLWP) approaches lack probabilistic modeling capabilities. To address this, we propose CoDiCast—the first conditional diffusion model designed for global weather forecasting. It conditions on historical observations and generates multivariate spatiotemporal weather fields via iterative denoising from Gaussian noise, enabling efficient ensemble forecasting and uncertainty quantification through stochastic sampling. Trained on ERA5 reanalysis data, CoDiCast supports forecasts of five or more meteorological variables at 6-hour intervals and 5.625° spatial resolution. It outperforms leading data-driven methods for 6-day forecasts while requiring only 12 minutes per inference on a single A100 GPU—overcoming the dual bottlenecks of NWP’s high computational cost and MLWP’s absence of uncertainty modeling.
📝 Abstract
Accurate weather forecasting is critical for science and society. Yet, existing methods have not managed to simultaneously have the properties of high accuracy, low uncertainty, and high computational efficiency. On one hand, to quantify the uncertainty in weather predictions, the strategy of ensemble forecast (i.e., generating a set of diverse predictions) is often employed. However, traditional ensemble numerical weather prediction (NWP) is computationally intensive. On the other hand, most existing machine learning-based weather prediction (MLWP) approaches are efficient and accurate. Nevertheless, they are deterministic and cannot capture the uncertainty of weather forecasting. In this work, we propose CoDiCast, a conditional diffusion model to generate accurate global weather prediction, while achieving uncertainty quantification with ensemble forecasts and modest computational cost. The key idea is to simulate a conditional version of the reverse denoising process in diffusion models, which starts from pure Gaussian noise to generate realistic weather scenarios for a future time point. Each denoising step is conditioned on observations from the recent past. Ensemble forecasts are achieved by repeatedly sampling from stochastic Gaussian noise to represent uncertainty quantification. CoDiCast is trained on a decade of ERA5 reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF). Experimental results demonstrate that our approach outperforms several existing data-driven methods in accuracy. Our conditional diffusion model, CoDiCast, can generate 6-day global weather forecasts, at 6-hour steps and $5.625^circ$ latitude-longitude resolution, for over 5 variables, in about 12 minutes on a commodity A100 GPU machine with 80GB memory. The open-souced code is provided at https://github.com/JimengShi/CoDiCast.