🤖 AI Summary
This study systematically compares the accuracy of AI-based and physics-based models in medium-range ensemble forecasting of 10-meter wind speed. Using observations from more than 9,000 synoptic stations worldwide for July–November 2025, it evaluates operational forecasts from ECMWF’s data-driven AIFS and its physics-based IFS, both raw and post-processed with parametric ensemble model output statistics (EMOS) and nonparametric quantile regression (QR) to correct systematic errors. The raw IFS ensemble substantially outperforms the raw AIFS ensemble at all investigated lead times. Both post-processing methods significantly improve the skill of each model, with EMOS ahead of QR on most verification metrics, especially at short lead times, and they markedly narrow the gap between the two models. After calibration, the remaining differences are significant mostly at short lead times, where IFS still outperforms AIFS. These results provide empirical evidence for assessing the operational use of AI-based weather prediction systems.
📝 Abstract
In the last few years, AI-based models have become the centre of attention in weather forecasting due to their increasing accuracy and efficiency. Pioneering among weather services, ECMWF has developed its Artificial Intelligence Forecasting System (AIFS) model, which was the first to provide data-driven ensemble forecasts, in June 2024. Since July 2025, the AIFS ensemble model has been operational and runs in parallel with ECMWF's physics-based Integrated Forecasting System (IFS), which is considered the gold standard in weather prediction. The new AIFS model can generate forecasts ten times faster than the classical numerical weather prediction model, while consuming approximately a thousand times less energy. We present a systematic assessment of the performance of the IFS and AIFS models, comparing the accuracy of raw and post-processed medium-range 10-m wind-speed ensemble forecasts generated operationally by the two models between July and November 2025 for more than 9000 synoptic observation stations across the globe. Post-processing uses the parametric ensemble model output statistics (EMOS) and the non-parametric quantile regression (QR) approaches to correct systematic inaccuracies in the raw forecasts. The predictive performance of the raw IFS ensemble forecasts proves to be substantially superior to that of the raw AIFS predictions for all investigated forecast horizons. As expected, post-processing significantly improves the skill of both IFS and AIFS predictions, and, across most verification metrics, EMOS is superior to QR, especially for short lead times. Relative to the raw ensembles, post-processing substantially reduces the differences in skill between matching IFS and AIFS predictions; the remaining differences are significant mostly at short lead times, where the IFS forecasts outperform their AIFS counterparts.
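To make the quantile regression idea concrete, the sketch below shows the pinball (quantile) loss that QR minimizes, applied to a toy bias correction. It is an illustration only, not the authors' implementation: the function names `pinball_loss` and `mean_pinball_loss` are hypothetical, and the wind-speed values are invented for the example rather than taken from the study.

```python
def pinball_loss(y, q_pred, tau):
    """Pinball loss for one observation y and a predicted tau-quantile q_pred.

    Under-prediction (y >= q_pred) is weighted by tau,
    over-prediction (y < q_pred) by (1 - tau).
    """
    diff = y - q_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_pinball_loss(obs, preds, tau):
    """Average pinball loss over paired observations and quantile forecasts."""
    return sum(pinball_loss(y, q, tau) for y, q in zip(obs, preds)) / len(obs)


# Invented toy data (m/s): a raw median forecast with a constant +2 m/s bias.
obs = [4.0, 6.0, 5.0, 7.0]
raw_median = [6.0, 8.0, 7.0, 9.0]

# Simplest possible "post-processing": subtract the mean forecast error.
bias = sum(f - y for f, y in zip(raw_median, obs)) / len(obs)
corrected_median = [f - bias for f in raw_median]

print(mean_pinball_loss(obs, raw_median, 0.5))        # 1.0
print(mean_pinball_loss(obs, corrected_median, 0.5))  # 0.0
```

In practice QR fits a separate regression for each quantile level `tau` on training data. The constant shift used here only stands in for that fitted model.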