🤖 AI Summary
Standard teacher-forcing training for long-horizon autoregressive forecasting of time-dependent partial differential equations (PDEs) induces a train–inference mismatch and, in the worst case, exponential error accumulation. Method: We propose the Recurrent Neural Operator (RNO) framework, the first to integrate recurrent training into neural operator architectures, so that training follows the same dynamics as autoregressive inference. RNO applies a multigrid neural operator (MgNO) recursively to its own predictions over a sliding temporal window to learn the PDE solution operator. Contribution/Results: We show theoretically that recurrent training can reduce worst-case error growth from exponential to linear, which mitigates exposure bias. On standard PDE benchmarks, RNO reduces long-horizon prediction error by over 40%, converges more robustly, and is more accurate and stable than its teacher-forced counterpart.
📝 Abstract
Neural operators have emerged as powerful tools for learning solution operators of partial differential equations. However, in time-dependent problems, standard training strategies such as teacher forcing introduce a mismatch between training and inference, leading to compounding errors in long-term autoregressive predictions. To address this issue, we propose Recurrent Neural Operators (RNOs), a novel framework that integrates recurrent training into neural operator architectures. Instead of conditioning each training step on ground-truth inputs, RNOs recursively apply the operator to their own predictions over a temporal window, effectively simulating inference-time dynamics during training. This alignment mitigates exposure bias and enhances robustness to error accumulation. Theoretically, we show that recurrent training can reduce the worst-case exponential error growth typical of teacher forcing to linear growth. Empirically, we demonstrate that recurrently trained Multigrid Neural Operators significantly outperform their teacher-forced counterparts in long-term accuracy and stability on standard benchmarks. Our results underscore the importance of aligning training with inference dynamics for robust temporal generalization in neural operator learning.
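The difference between the two training objectives described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the operator is a hypothetical scalar decay map standing in for a neural operator, the temporal window is taken to be the whole toy trajectory, and the names `teacher_forcing_loss`, `recurrent_loss`, and `make_decay` are illustrative assumptions.

```python
from typing import Callable, List, Sequence

State = List[float]
Operator = Callable[[State], State]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized states."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def teacher_forcing_loss(op: Operator, traj: Sequence[State]) -> float:
    """Average one-step loss; every step starts from the ground-truth state."""
    steps = len(traj) - 1
    return sum(mse(op(traj[t]), traj[t + 1]) for t in range(steps)) / steps


def recurrent_loss(op: Operator, traj: Sequence[State]) -> float:
    """Average rollout loss; every step starts from the previous prediction."""
    steps = len(traj) - 1
    state = list(traj[0])  # only the initial condition comes from the data
    total = 0.0
    for t in range(steps):
        state = op(state)  # feed the prediction back in, as at inference time
        total += mse(state, traj[t + 1])
    return total / steps


def make_decay(rate: float) -> Operator:
    """Hypothetical stand-in for a learned operator: pointwise decay."""
    return lambda u: [rate * x for x in u]


# Toy ground-truth trajectory generated by a known operator (assumption).
true_op = make_decay(0.9)
traj: List[State] = [[1.0, 2.0, -1.0]]
for _ in range(10):
    traj.append(true_op(traj[-1]))

# A slightly mis-specified model: its one-step error looks small under
# teacher forcing, but the rollout loss exposes the accumulated drift.
model = make_decay(0.95)
tf = teacher_forcing_loss(model, traj)
rec = recurrent_loss(model, traj)
print(f"teacher forcing: {tf:.6f}  recurrent rollout: {rec:.6f}")
```

In this toy setting the recurrent loss is larger than the teacher-forcing loss for the same imperfect model, because it measures the error the model would actually incur when rolled out on its own predictions. Minimizing that rollout loss is the sense in which recurrent training aligns the objective with inference.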