🤖 AI Summary
Existing PDE foundation models rely heavily on large-scale Transformer architectures, incurring prohibitive parameter counts and computational costs. To address this, we propose SPUS, the first unified neural operator foundation model built on a lightweight residual U-Net architecture. Our key contributions are threefold: (i) the first integration of a residual U-Net into PDE foundation modeling; (ii) a physics-informed autoregressive pretraining strategy that explicitly emulates numerical PDE solvers to learn conservation laws and dynamical evolution; and (iii) joint pretraining on multi-physics fluid PDEs coupled with few-shot transfer mechanisms. Experiments demonstrate that SPUS achieves state-of-the-art generalization across six unseen PDE tasks while reducing parameter count by one to two orders of magnitude compared to mainstream approaches. Moreover, it adapts rapidly to novel equations with only a few fine-tuning samples.
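The solver-emulating autoregressive pretraining described above can be sketched in miniature. The following is an illustrative toy, not the paper's implementation: the 1-D heat equation stepper `heat_step`, the linear surrogate `weights @ u`, and all names are assumptions chosen to show how a model is trained to reproduce a numerical solver's time stepping.

```python
import numpy as np

def heat_step(u, dt=1e-3, dx=1.0):
    # One explicit finite-difference step of the 1-D heat equation with
    # periodic boundaries, standing in for a numerical PDE solver that
    # generates reference trajectories.
    return u + dt / dx**2 * (np.roll(u, 1) - 2 * u + np.roll(u, -1))

def rollout_loss(weights, u0, n_steps=4):
    # Autoregressive pretraining objective: the surrogate repeatedly maps
    # the current state to the next one, mirroring the solver's stepping,
    # and is penalized by the per-step mean-squared error.
    u_pred = u0
    u_true = u0
    loss = 0.0
    for _ in range(n_steps):
        u_true = heat_step(u_true)               # solver reference step
        u_pred = weights @ u_pred                # surrogate "model" step
        loss += np.mean((u_pred - u_true) ** 2)  # per-step MSE
    return loss / n_steps

rng = np.random.default_rng(0)
u0 = rng.standard_normal(32)
w = np.eye(32)  # identity init: the surrogate initially predicts no change
loss = rollout_loss(w, u0)  # positive but small at initialization
```

In the actual model the linear map would be replaced by the residual U-Net, and the loss minimized by gradient descent over many fluid-dynamics trajectories.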
📄 Abstract
We introduce the Small PDE U-Net Solver (SPUS), a compact and efficient foundation model (FM) designed as a unified neural operator for solving a wide range of partial differential equations (PDEs). Unlike existing state-of-the-art PDE FMs, which are primarily based on large, complex transformer architectures with high computational and parameter overhead, SPUS leverages a lightweight residual U-Net architecture that has been largely underexplored as a foundation-model backbone in this domain. To enable effective learning in this minimalist framework, we use a simple yet powerful autoregressive pretraining strategy that closely replicates the behavior of numerical solvers in order to learn the underlying physics. SPUS is pretrained on a diverse set of fluid dynamics PDEs and evaluated on six challenging unseen downstream PDEs spanning various physical systems. Experimental results demonstrate that SPUS, with its residual U-Net architecture, achieves state-of-the-art generalization on these downstream tasks while requiring significantly fewer parameters and minimal fine-tuning data, highlighting its potential as a highly parameter-efficient FM for solving diverse PDE systems.
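The core idea behind the residual U-Net backbone can be illustrated with a minimal sketch. This is a toy in NumPy, not the paper's architecture: `conv1d`, the kernel values, and the 1-D periodic setting are all assumptions used only to show why residual connections suit time-stepping operators, where the next state is a small correction to the current one.

```python
import numpy as np

def conv1d(u, kernel):
    # Periodic 1-D convolution, a stand-in for the U-Net's conv layers.
    n, k = len(u), len(kernel)
    pad = k // 2
    up = np.concatenate([u[-pad:], u, u[:pad]])
    return np.array([up[i:i + k] @ kernel for i in range(n)])

def residual_block(u, kernel):
    # A residual block learns a *correction* to its input rather than the
    # full input-to-output mapping: out = u + f(u). For PDE time stepping,
    # where u_{t+1} is close to u_t, the correction is small and easy to fit.
    return u + np.tanh(conv1d(u, kernel))

u = np.linspace(0.0, 1.0, 16)
k = np.array([0.1, -0.2, 0.1])
out = residual_block(u, k)  # same shape as u, near-identity at small weights
```

A full residual U-Net stacks such blocks inside an encoder-decoder with downsampling, upsampling, and skip connections, but the residual formulation shown here is what lets a small network emulate a solver's incremental updates.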