LeMON: Learning to Learn Multi-Operator Networks

📅 2024-08-28
🏛️ arXiv.org
📈 Citations: 7
Influential: 0
📄 PDF
🤖 AI Summary
Generalizing operator learning to unseen partial differential equations (PDEs) under limited training samples remains challenging. Method: We propose a multi-operator joint pretraining and lightweight fine-tuning framework. It introduces the first PDE-agnostic meta-initialization strategy, a unified multi-operator embedding architecture, and integrates a MAML variant, LoRA-based adaptation, and cross-PDE-family pretraining. Contributions/Results: We theoretically establish a scaling law linking operator family size to generalization performance. Our method enables true zero-shot prediction for novel PDE operators—achieving <5% prediction error without any task-specific data. Empirically, it significantly outperforms single-operator baselines across diverse PDE families; adapts efficiently with only a few samples (<5) per target operator; reduces computational overhead by 40% while improving accuracy. The framework thus bridges the gap between broad applicability and sample efficiency in operator learning.

Technology Category

Application Category

📝 Abstract
Single-operator learning involves training a deep neural network to learn a specific operator, whereas recent work in multi-operator learning uses an operator embedding structure to train a single neural network on data from multiple operators. Thus, multi-operator learning is capable of predicting a range of operators within one model. In this work, we propose pretraining and fine-tuning strategies for solving PDEs using multi-operator learning. One key aspect is that by increasing the number of families of operators used in pretraining, a PDE foundation model can be fine-tuned to downstream tasks involving new PDEs with a limited number of samples, thus outperforming single operator neural networks. Specifically, a multi-operator learning model pre-trained with data from diverse PDE families can predict unseen operators after fine-tuning with only a limited number of operators from the new family, enabling them to serve as a data-free PDE solver. We also show that the proposed training and fine-tuning method is able to predict new operators in zero-shot prediction without samples. Additionally, we introduce a PDE-agnostic meta-learning algorithm to improve the adaptability of the model to various PDEs by providing a better parameter initialization process. To address the needs of applications with limited computing resources, we explore low-rank adaptation methods that reduce computational costs while enhancing solver accuracy. Lastly, by examining the scaling law with respect to the number of operator families, we establish and highlight its potential for broad adaptation in PDE-solving tasks.
Problem

Research questions and friction points this paper is trying to address.

Develop pretraining and fine-tuning for multi-operator PDE solving
Enable zero-shot prediction of new PDE operators without samples
Improve adaptability and reduce costs for limited-resource PDE applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pretraining and fine-tuning multi-operator PDE models
PDE-agnostic meta-learning for better initialization
Low-rank adaptation to reduce computational costs
🔎 Similar Papers
No similar papers found.