🤖 AI Summary
Problem: Conventional function-learning approaches to nonlinear partial differential equations (PDEs) train a separate neural network for each equation and therefore generalize poorly across distinct PDEs.
Method: We propose a physics-informed universal operator learning framework built upon DeepONet, incorporating physical constraints directly into the learning process.
Contribution/Results: We establish, for the first time, a rigorous upper bound on the generalization error of DeepONet for solving nonlinear PDEs under the Sobolev norm, combining Rademacher complexity with pseudo-dimension theory to give theoretical support for the “deep branch + shallow trunk” architecture. Because the bound explicitly incorporates derivative constraints, it guarantees simultaneous approximation of both solutions and their gradients. Experiments demonstrate zero-shot cross-equation transferability across diverse nonlinear PDEs, including the Burgers’, Allen–Cahn, and Navier–Stokes equations, with substantial gains in computational efficiency and generalization accuracy and no retraining.
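To make the architectural claim concrete, here is a minimal sketch of a DeepONet with a deep branch network and a shallow trunk network, plus a physics-informed residual whose derivatives come from automatic differentiation. This is our illustration, not the authors' code: the PyTorch setting, the layer widths, and the viscous Burgers' residual are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

def mlp(sizes, act=nn.Tanh):
    """Fully connected network: Linear layers with activations in between."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(act())
    return nn.Sequential(*layers)

class DeepONet(nn.Module):
    """Minimal DeepONet: the branch net encodes the input function sampled at
    m sensor points; the trunk net encodes the query coordinate y = (x, t)."""
    def __init__(self, m, coord_dim, p=64):
        super().__init__()
        # Deep branch: several hidden layers (the regime the paper favors).
        self.branch = mlp([m, 128, 128, 128, 128, p])
        # Shallow trunk: kept deliberately simple.
        self.trunk = mlp([coord_dim, 128, p])
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors, y):
        b = self.branch(u_sensors)              # (batch, p)
        t = self.trunk(y)                       # (batch, p)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias

def burgers_residual(model, u_sensors, y, nu=0.01):
    """Physics-informed residual s_t + s*s_x - nu*s_xx for viscous Burgers',
    with derivatives of the prediction taken by autograd. Assumes y = (x, t)."""
    y = y.clone().requires_grad_(True)
    s = model(u_sensors, y)
    grad = torch.autograd.grad(s.sum(), y, create_graph=True)[0]
    s_x, s_t = grad[:, :1], grad[:, 1:2]
    s_xx = torch.autograd.grad(s_x.sum(), y, create_graph=True)[0][:, :1]
    return s_t + s * s_x - nu * s_xx

# Usage: 32 input functions sampled at 100 sensors, random collocation points.
model = DeepONet(m=100, coord_dim=2)
u = torch.randn(32, 100)
y = torch.rand(32, 2)
loss = (burgers_residual(model, u, y) ** 2).mean()  # plus IC/BC terms in practice
```

Training on the squared residual (plus initial/boundary terms) penalizes the network's derivatives directly, which is what connects physics-informed training to the Sobolev-norm analysis above.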
📝 Abstract
In this paper, we investigate the use of operator learning, specifically DeepONet, for solving nonlinear partial differential equations (PDEs). Unlike conventional function learning methods that require training separate neural networks for each PDE, operator learning enables generalization across different PDEs without retraining. This study examines the performance of DeepONet in physics-informed training, focusing on two key aspects: (1) the approximation capabilities of deep branch and trunk networks, and (2) the generalization error in Sobolev norms. Our results demonstrate that deep branch networks provide substantial performance improvements, while trunk networks achieve optimal results when kept relatively simple. Furthermore, we derive a bound on the generalization error of DeepONet for solving nonlinear PDEs by analyzing the Rademacher complexity of its derivatives in terms of pseudo-dimension. This work fills a critical theoretical gap by providing rigorous error estimates for a wide range of physics-informed machine learning models and applications.
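For reference, measuring error in a Sobolev norm means controlling the solution and its first derivatives at once; with the standard $H^1$ norm (notation ours, not taken from the paper), the quantity being bounded is

$$
\|s - s_\theta\|_{H^1(\Omega)}^2 = \|s - s_\theta\|_{L^2(\Omega)}^2 + \|\nabla s - \nabla s_\theta\|_{L^2(\Omega)}^2,
$$

where $s$ is the true PDE solution and $s_\theta$ is the DeepONet prediction, so a small generalization error in this norm guarantees accurate solutions and accurate gradients simultaneously.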