🤖 AI Summary
This study examines fundamental limitations of data-driven modeling for multiscale dynamical systems, with a focus on neural emulators and stochastic climate models. Existing autoregressive neural models often reproduce stationary statistics but typically struggle to capture responses to external forcing. Using a low-dimensional dynamical system as an analogy, the authors show that neural stochastic models trained on the full state vector reproduce both the equilibrium statistics and the forced response of the ensemble mean and variance. Under partial observability, two challenges emerge: choosing the proper variables to model and parameterizing the influence of unobserved degrees of freedom. The authors argue that physically grounded strategies such as coarse-graining and stochastic parameterization, aimed at the slow dynamics of the system, are essential for skillful emulation. They then apply these ideas to a stochastic reduced neural model of the sea surface temperature field and the net radiative flux at the top of the atmosphere, assessing its stationary statistics, response to temperature forcing, and interpretability.
📝 Abstract
This work explores key conceptual limitations in data-driven modeling of multiscale dynamical systems, focusing on neural emulators and stochastic climate modeling. A skillful climate model should capture both stationary statistics and responses to external perturbations. While current autoregressive neural models often reproduce the former, they typically struggle with the latter. We begin by analyzing a low-dimensional dynamical system to expose, by analogy, fundamental limitations that persist in high-dimensional settings. Specifically, we construct neural stochastic models under two scenarios: one where the full state vector is observed, and another with only partial observations (i.e. a subset of variables). In the first case, the models accurately capture both equilibrium statistics and forced responses in ensemble mean and variance. In the more realistic case of partial observations, two key challenges emerge: (i) identifying the \textit{proper} variables to model, and (ii) parameterizing the influence of unobserved degrees of freedom. These issues are not specific to neural networks but reflect fundamental limitations of data-driven modeling and the need to target the slow dynamics of the system. We argue that physically grounded strategies -- such as coarse-graining and stochastic parameterizations -- are critical, both conceptually and practically, for the skillful emulation of complex systems like the coupled climate system. Building on these insights, we turn to a more realistic application: a stochastic reduced neural model of the sea surface temperature field and the net radiative flux at the top of the atmosphere, assessing its stationary statistics, response to temperature forcing, and interpretability.
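The contrast between full and partial observations can be illustrated with a minimal sketch. It is not the paper's model: it assumes a hypothetical two-variable linear stochastic system (the matrix `A`, the noise level, and the forcing `f` are invented for illustration) and replaces the neural network with a least-squares AR(1) fit. Both fits are trained only on unforced data, and the long-time mean response to a constant forcing is then compared with the true response.

```python
import numpy as np

def simulate(A, sigma, n_steps, rng):
    """Simulate the unforced system x_{t+1} = A x_t + sigma * noise."""
    d = A.shape[0]
    x = np.zeros((n_steps, d))
    noise = rng.standard_normal((n_steps, d))
    for t in range(n_steps - 1):
        x[t + 1] = A @ x[t] + sigma * noise[t]
    return x

def fit_ar1(x):
    """Least-squares estimate of M in x_{t+1} ~ M x_t."""
    past, future = x[:-1], x[1:]
    # lstsq solves past @ B = future, so the AR(1) matrix is B transposed.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T

def steady_response(M, forcing):
    """Long-time mean shift under constant forcing: (I - M)^{-1} forcing."""
    return np.linalg.solve(np.eye(M.shape[0]) - M, forcing)

rng = np.random.default_rng(0)

# Hypothetical toy dynamics: variable 0 is observed, variable 1 is the
# degree of freedom that is hidden in the partial-observation scenario.
A = np.array([[0.9, 0.3],
              [-0.3, 0.5]])
x = simulate(A, sigma=0.1, n_steps=50_000, rng=rng)

f = 0.01  # constant forcing applied to variable 0 (assumed value)

truth = steady_response(A, np.array([f, 0.0]))[0]
full = steady_response(fit_ar1(x), np.array([f, 0.0]))[0]
partial = steady_response(fit_ar1(x[:, :1]), np.array([f]))[0]

print(f"true response of variable 0:       {truth:.4f}")
print(f"full-state AR(1) estimate:         {full:.4f}")
print(f"partial-observation AR(1) estimate: {partial:.4f}")
```

In this toy setting the scalar fit matches the lag-1 autocorrelation of the observed variable, yet its forced response differs from the truth because the feedback through the hidden variable is folded into a single coefficient. This is only an analogy for the parameterization problem described in the abstract, not a reproduction of its experiments.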