🤖 AI Summary
Optimizing deep neural networks for solving partial differential equations (PDEs) remains challenging due to highly nonconvex loss landscapes, which lead to poor convergence, trapping in local minima, and exploding or vanishing gradients. To address this, we propose the Layer-Separation (LySep) model. LySep introduces auxiliary variables to decouple strong inter-layer dependencies in deep networks, reformulating the original problem into an equivalent form in which coupling occurs only between adjacent layers. We then develop an optimization framework based on the Alternating Direction Method of Multipliers (ADMM), enabling closed-form updates for most variables. We prove theoretically that LySep is equivalent to the original neural network model in terms of optimal solutions. Numerical experiments demonstrate that LySep significantly reduces both the training loss and the PDE solution error, consistently outperforming state-of-the-art physics-informed neural networks (PINNs) and other baselines, particularly in high-dimensional settings.
📝 Abstract
In this paper, we propose a new optimization framework, the layer separation (LySep) model, to improve deep learning-based methods for solving partial differential equations. Due to the highly non-convex nature of the loss function in deep learning, existing optimization algorithms often converge to suboptimal local minima or suffer from exploding or vanishing gradients, resulting in poor performance. To address these issues, we introduce auxiliary variables that separate the layers of deep neural networks. Specifically, the output of each layer and its derivatives are represented by auxiliary variables, effectively decomposing the deep architecture into a series of shallow architectures. New loss functions with auxiliary variables are established, in which only variables from two neighboring layers are coupled. Corresponding algorithms based on alternating directions are developed, in which many variables can be updated optimally in closed form. Moreover, we provide theoretical analyses demonstrating the consistency between the LySep model and the original deep model. High-dimensional numerical results validate our theory and demonstrate the advantages of LySep in minimizing loss and reducing solution error.
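To make the layer-separation idea concrete, the following is a minimal toy sketch (not the paper's algorithm, which handles nonlinear activations, PDE residual terms, and ADMM multipliers): a two-layer linear network is fit by introducing an auxiliary variable `U` for the hidden-layer output and penalizing the coupling constraint `U = W1 @ X` quadratically. Each alternating update then involves only two neighboring layers and admits a closed-form least-squares solution. All variable names and the penalty weight `beta` are illustrative choices, not from the paper.

```python
import numpy as np

# Toy regression target: Y = W_true @ X, to be fit by a two-layer
# linear network Y ≈ W2 @ (W1 @ X).
rng = np.random.default_rng(0)
d, h, n = 5, 8, 200
X = rng.standard_normal((d, n))
W_true = rng.standard_normal((1, d))
Y = W_true @ X

W1 = 0.1 * rng.standard_normal((h, d))
W2 = 0.1 * rng.standard_normal((1, h))
U = W1 @ X      # auxiliary variable standing in for the hidden-layer output
beta = 10.0     # penalty weight on the layer-coupling constraint U = W1 @ X
eps = 1e-8      # small ridge term to keep the normal equations well-posed

for _ in range(50):
    # Update W1: minimize ||U - W1 X||^2 over W1 (closed-form least squares;
    # touches only the input layer and the auxiliary variable U).
    W1 = U @ X.T @ np.linalg.inv(X @ X.T + eps * np.eye(d))
    # Update U: minimize ||Y - W2 U||^2 + beta * ||U - W1 X||^2 over U
    # (closed form; couples only the two neighboring layers).
    A = W2.T @ W2 + beta * np.eye(h)
    U = np.linalg.solve(A, W2.T @ Y + beta * (W1 @ X))
    # Update W2: minimize ||Y - W2 U||^2 over W2 (closed-form least squares).
    W2 = Y @ U.T @ np.linalg.inv(U @ U.T + eps * np.eye(h))

loss = float(np.mean((Y - W2 @ U) ** 2))
print(f"data-fit loss: {loss:.3e}")
```

Because every subproblem is a small least-squares solve, no gradient has to propagate through the full depth of the network, which is the mechanism the abstract credits for avoiding exploding or vanishing gradients. The full LySep model additionally represents derivatives of each layer's output by auxiliary variables so that PDE residuals can be expressed layer-locally.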