🤖 AI Summary
This work addresses the problem of learning operators between function spaces in two settings: (i) multi-operator learning, where a single neural network represents a family of operators parameterized by continuous variables; and (ii) multi-independent-operator learning, where several heterogeneous operators are modeled within one framework. To this end, we propose two novel architectures, MNO and MONet. We establish the first general approximation theory for multi-operator learning and derive explicit scaling laws relating network size to approximation accuracy. Our method integrates deep neural networks, function-space approximation theory, and a PDE-informed multi-task learning framework, augmented with a complexity-balancing mechanism. Evaluated on parametric PDE benchmarks, our approach achieves significant improvements in both operator approximation accuracy and computational efficiency, bridging theoretical guarantees with empirical performance.
📝 Abstract
While many problems in machine learning focus on learning mappings between finite-dimensional spaces, scientific applications require approximating mappings between function spaces, i.e., operators. We study the problem of learning collections of operators and provide both theoretical and empirical advances. We distinguish between two regimes: (i) multiple operator learning, where a single network represents a continuum of operators indexed by a function-valued parameter, and (ii) learning several distinct single operators, where each operator is learned independently. For the multiple operator case, we introduce two new architectures, $\mathrm{MNO}$ and $\mathrm{MONet}$, and establish universal approximation results in three settings: continuous, integrable, and Lipschitz operators. For the Lipschitz setting, we further derive explicit scaling laws that quantify how the network size must grow to achieve a target approximation accuracy. For learning several single operators, we develop a framework for balancing architectural complexity across subnetworks and show how approximation order determines computational efficiency. Empirical experiments on parametric PDE benchmarks confirm the strong expressive power and efficiency of the proposed architectures. Overall, this work establishes a unified theoretical and practical foundation for scalable neural operator learning across multiple operators.
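To make the multiple-operator regime concrete: one common way to let a single network represent a family of operators $G_\alpha$ is to condition a branch-trunk (DeepONet-style) network on the operator parameter $\alpha$ via an extra subnetwork. The paper's $\mathrm{MNO}$ and $\mathrm{MONet}$ layers are not specified in this abstract, so the sketch below is a generic, hypothetical illustration of that conditioning idea with untrained random-weight MLPs, not the authors' architecture; all names (`branch`, `trunk`, `param`, `G`) are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP with tanh activations (illustrative, untrained)."""
    Ws = [rng.normal(0.0, 1.0 / np.sqrt(m), (m, n))
          for m, n in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for W in Ws[:-1]:
            x = np.tanh(x @ W)
        return x @ Ws[-1]
    return forward

m, p = 32, 16               # number of input-function sensors, latent width
branch = mlp([m, 64, p])    # encodes input-function samples u(x_1), ..., u(x_m)
trunk  = mlp([1, 64, p])    # encodes the query location y
param  = mlp([1, 64, p])    # encodes the operator parameter alpha

def G(u_samples, y, alpha):
    """Approximate G_alpha(u)(y) as an inner product of three latent features."""
    b = branch(u_samples)            # (batch, p)
    t = trunk(np.atleast_2d(y))      # (1, p), broadcast over the batch
    a = param(np.atleast_2d(alpha))  # (1, p), broadcast over the batch
    return np.sum(b * t * a, axis=-1)  # (batch,) scalar outputs

u = rng.normal(size=(4, m))          # a batch of 4 sampled input functions
out = G(u, y=0.5, alpha=0.1)
print(out.shape)                     # (4,)
```

Varying `alpha` sweeps continuously through the operator family without retraining separate networks, which is the sense in which one network can represent a continuum of operators; the scaling laws described in the abstract would then govern how `p` and the hidden widths must grow with the target accuracy.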