Flowers: A Warp Drive for Neural PDE Solvers

📅 2026-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a purely deformation-driven neural architecture for efficiently learning solution operators of time-dependent partial differential equations (PDEs), such as those governing fluid dynamics and wave propagation. Departing from conventional approaches that rely on Fourier multipliers, convolutions, or dot-product attention, the method introduces non-local interactions exclusively through multi-head displacement field prediction and sparse sampling at source coordinates. This enables adaptive global modeling within multiscale residual blocks while maintaining linear computational complexity. Experimental results show that the 17M-parameter model outperforms Fourier, convolution, and attention-based baselines of comparable size across diverse 2D and 3D time-dependent PDE tasks, and that its 150M-parameter variant improves over recent Transformer-based foundation models that use more parameters, data, and training compute, supporting the efficacy and scalability of deformation-based mechanisms for PDE solving.

📝 Abstract

We introduce Flowers, a neural architecture for learning PDE solution operators built entirely from multihead warps. Aside from pointwise channel mixing and a multiscale scaffold, Flowers use no Fourier multipliers, no dot-product attention, and no convolutional mixing. Each head predicts a displacement field and warps the mixed input features. Motivated by physics and computational efficiency, displacements are predicted pointwise, without any spatial aggregation, and nonlocality enters only through sparse sampling at source coordinates, one per head. Stacking warps in multiscale residual blocks yields Flowers, which implement adaptive, global interactions at linear cost. We theoretically motivate this design through three complementary lenses: flow maps for conservation laws, waves in inhomogeneous media, and a kinetic-theoretic continuum limit. Flowers achieve excellent performance on a broad suite of 2D and 3D time-dependent PDE benchmarks, particularly flows and waves. A compact 17M-parameter model consistently outperforms Fourier, convolution, and attention-based baselines of similar size, while a 150M-parameter variant improves over recent transformer-based foundation models with far more parameters, data, and training compute.
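The warp mechanism described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. It assumes a periodic 1D grid, linear interpolation, a displacement that is a linear function of the local feature vector, and head outputs that are summed into a residual connection. The names `sample_periodic`, `warp_head`, and `multihead_warp_block` are hypothetical, and the multiscale scaffold is omitted.

```python
import math


def sample_periodic(values, pos):
    """Linearly interpolate a periodic 1D signal at fractional index pos."""
    n = len(values)
    base = math.floor(pos)
    frac = pos - base
    left = values[base % n]
    right = values[(base + 1) % n]
    return (1.0 - frac) * left + frac * right


def warp_head(features, mix_weights, disp_weights):
    """One hypothetical warp head on a periodic 1D grid (illustrative only).

    features:     list of C channels, each a list of N grid values
    mix_weights:  C_out x C matrix for pointwise channel mixing
    disp_weights: length-C vector; the displacement (in grid cells) is an
                  assumed linear function of the local feature vector
    """
    channels = len(features)
    n = len(features[0])
    # Pointwise channel mixing: no information moves between grid points.
    mixed = [
        [sum(row[c] * features[c][i] for c in range(channels)) for i in range(n)]
        for row in mix_weights
    ]
    # Pointwise displacement: depends only on the features at point i.
    disp = [
        sum(disp_weights[c] * features[c][i] for c in range(channels))
        for i in range(n)
    ]
    # Warp: each output point reads one source coordinate, i + disp[i].
    return [
        [sample_periodic(channel, i + disp[i]) for i in range(n)]
        for channel in mixed
    ]


def multihead_warp_block(features, heads):
    """Residual block: add every head's warped output to the input.

    Summing heads is an assumption of this sketch; each mix_weights
    matrix must be C x C so the shapes match the residual path.
    """
    out = [list(channel) for channel in features]
    for mix_weights, disp_weights in heads:
        warped = warp_head(features, mix_weights, disp_weights)
        for c, channel in enumerate(warped):
            for i, value in enumerate(channel):
                out[c][i] += value
    return out


# Usage: a ramp channel plus a constant channel that drives a unit shift.
feats = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
shifted = warp_head(feats, identity, [0.0, 1.0])
```

In this sketch each grid point performs one interpolated read per head, so the work grows linearly with the number of grid points, while a large predicted displacement lets a point read from anywhere in the domain in a single step.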

Problem

Research questions and friction points this paper is trying to address.

neural PDE solvers

solution operators

nonlocal interactions

computational efficiency

time-dependent PDE

Innovation

Methods, ideas, or system contributions that make the work stand out.

neural PDE solvers

multihead warps

displacement fields