Beyond Loss Guidance: Using PDE Residuals as Spectral Attention in Diffusion Neural Operators

📅 2025-12-01
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing diffusion-based PDE solvers rely on test-time gradient optimization, suffering from slow computation, instability, and poor robustness to noise. This paper introduces PRISMA, a conditional diffusion neural operator that for the first time embeds the PDE residual directly into the model architecture via a spectral-domain attention mechanism, enabling residual-driven dynamic feature modulation and eliminating test-time optimization entirely. Its core innovation lies in transforming the residual from an external loss signal into an internal spectral-domain attention feature, achieving architectural-level integration. Evaluated on five standard PDE benchmarks, PRISMA matches state-of-the-art accuracy while accelerating inference by 15–250×, reducing denoising steps by 10–100×, and significantly improving robustness under noisy residual conditions.

πŸ“ Abstract
Diffusion-based solvers for partial differential equations (PDEs) are often bottlenecked by slow gradient-based test-time optimization routines that use PDE residuals for loss guidance. They additionally suffer from optimization instabilities and are unable to dynamically adapt their inference scheme in the presence of noisy PDE residuals. To address these limitations, we introduce PRISMA (PDE Residual Informed Spectral Modulation with Attention), a conditional diffusion neural operator that embeds PDE residuals directly into the model's architecture via attention mechanisms in the spectral domain, enabling gradient-descent-free inference. In contrast to previous methods that use PDE loss solely as an external optimization target, PRISMA integrates PDE residuals as integral architectural features, making it inherently fast, robust, accurate, and free from sensitive hyperparameter tuning. We show that PRISMA achieves competitive accuracy at substantially lower inference cost than previous methods across five benchmark PDEs, especially with noisy observations, while using 10x to 100x fewer denoising steps, leading to 15x to 250x faster inference.
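To make the core idea concrete, here is a minimal numpy sketch of residual-informed spectral modulation: the feature field and the PDE residual are moved to the Fourier domain, attention weights computed from the residual spectrum gate the retained low-frequency feature modes, and the result is transformed back. All names, shapes, and the random stand-in projection matrices are illustrative assumptions for exposition, not the paper's actual PRISMA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_residual_attention(u, residual, n_modes=8, d_k=16, seed=0):
    """Modulate the low-frequency spectral modes of a feature field `u`
    with attention weights derived from the PDE residual's spectrum.
    `u` and `residual` are real 2-D arrays of the same shape.
    Random matrices below stand in for learned projections (assumption)."""
    rng = np.random.default_rng(seed)
    U = np.fft.rfft2(u)          # feature spectrum
    R = np.fft.rfft2(residual)   # residual spectrum
    # Keep only the lowest n_modes x n_modes modes (FNO-style truncation).
    Uk = U[:n_modes, :n_modes]
    Rk = R[:n_modes, :n_modes]
    # Treat each retained mode as a token with (Re, Im) features.
    tokens_u = np.stack([Uk.real.ravel(), Uk.imag.ravel()], axis=-1)  # (M, 2)
    tokens_r = np.stack([Rk.real.ravel(), Rk.imag.ravel()], axis=-1)  # (M, 2)
    Wq = rng.normal(size=(2, d_k)) / np.sqrt(2)  # stand-in for learned weights
    Wk = rng.normal(size=(2, d_k)) / np.sqrt(2)
    q = tokens_u @ Wq
    k = tokens_r @ Wk
    attn = softmax(q @ k.T / np.sqrt(d_k))       # (M, M) residual attention
    # Residual-driven gate per mode, squashed to (0, 1).
    gate = attn @ np.abs(tokens_r).sum(axis=-1)
    gate = 1.0 / (1.0 + np.exp(-gate))
    # Modulate the retained feature modes and return to physical space.
    U_out = U.copy()
    U_out[:n_modes, :n_modes] = Uk * gate.reshape(n_modes, n_modes)
    return np.fft.irfft2(U_out, s=u.shape)

# Toy demo: a smooth field and a synthetic "residual" field on a 32x32 grid.
grid = np.linspace(0.0, 1.0, 32)
x, y = np.meshgrid(grid, grid)
u = np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)
residual = 0.1 * np.cos(4 * np.pi * x) * np.sin(2 * np.pi * y)
out = spectral_residual_attention(u, residual)
```

The design point this illustrates is that the residual enters as an architectural input that reshapes features at every forward pass, rather than as a loss whose gradient must be followed at test time.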
Problem

Research questions and friction points this paper is trying to address.

Eliminates slow gradient-based optimization in PDE solvers
Addresses instability and noise sensitivity in diffusion models
Integrates PDE residuals as architectural features for efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses PDE residuals as spectral attention mechanisms
Integrates residuals into architecture for gradient-free inference
Achieves fast robust inference with fewer denoising steps