Conservation Laws from Data Symmetry in Neural Networks

📅 2026-06-09
🤖 AI Summary
This work investigates whether intrinsic symmetries in training data induce conserved quantities during gradient-flow training of neural networks. Drawing on tools from differential geometry and dynamical systems theory, the study connects data symmetries to conservation laws in training dynamics, using tensorizable networks (a family that includes linear networks, polynomial networks, and Lightning Attention) as the analytical framework. For analytic, non-polynomial loss functions, data symmetries generically do not yield additional conserved quantities. Under mean squared error (MSE) loss, however, there are situations in which data augmentation produces extra conserved quantities. The result points to a conservation mechanism specific to MSE loss and offers a new perspective on the dynamics of neural network training.
📝 Abstract
We explore whether intrinsic symmetries of the training data lead to conserved quantities during gradient-flow training of neural networks. Under the assumption that the loss function is analytic and non-polynomial, we prove that data symmetries generically do not induce any additional integrals of motion. For mean squared error (MSE) loss, on the other hand, there are situations in which data augmentation yields extra conserved quantities. We build a framework based on tensorizable networks to describe this phenomenon. Tensorizable networks are a family of architectures whose dependence on parameters and inputs can be separated using an intermediate representation. They include linear and polynomial networks, as well as Lightning Attention.
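To make "integral of motion" concrete, here is a minimal sketch of a standard architecture-induced conservation law, not one of the data-symmetry-induced quantities studied in the paper. For a toy scalar linear network f(x) = a * b * x, the loss depends on the parameters only through the product a * b, so a * dL/da = b * dL/db and the quantity Q = a^2 - b^2 is constant along gradient flow. The data, initial values, learning rate, and function names below are illustrative assumptions. Gradient descent with a small step only approximates gradient flow, so Q is conserved up to a small drift.

```python
# Toy two-parameter linear network f(x) = a * b * x trained with MSE loss.
# All names and values here are hypothetical and chosen for illustration.

def mse_loss(a, b, xs, ys):
    """Mean squared error of the prediction a * b * x against y."""
    return sum((a * b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradients(a, b, xs, ys):
    """Return (dL/da, dL/db) for the MSE loss above."""
    n = len(xs)
    # Derivative of the loss with respect to the product a * b.
    g_prod = sum(2.0 * (a * b * x - y) * x for x, y in zip(xs, ys)) / n
    # Chain rule: d(ab)/da = b and d(ab)/db = a.
    return g_prod * b, g_prod * a


def train(a, b, xs, ys, lr=1e-4, steps=5000):
    """Plain gradient descent, used as a small-step stand-in for gradient flow."""
    for _ in range(steps):
        ga, gb = gradients(a, b, xs, ys)
        a, b = a - lr * ga, b - lr * gb
    return a, b


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # targets generated by y = 2 * x
a0, b0 = 1.5, 0.5

a1, b1 = train(a0, b0, xs, ys)

q_init = a0 ** 2 - b0 ** 2   # candidate conserved quantity before training
q_final = a1 ** 2 - b1 ** 2  # the same quantity after training
final_loss = mse_loss(a1, b1, xs, ys)

print("Q before:", q_init)
print("Q after: ", q_final)
print("final loss:", final_loss)
```

Shrinking `lr` while holding `lr * steps` fixed should reduce the drift in Q, since each discrete step changes Q by a term proportional to `lr ** 2`.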
Problem

Research questions and friction points this paper is trying to address.

conservation laws
data symmetry
neural networks
gradient flow
integrals of motion
Innovation

Methods, ideas, or system contributions that make the work stand out.

tensorizable networks
data symmetry
conservation laws
gradient flow
MSE loss