🤖 AI Summary
This work addresses the challenge of enforcing physical symmetries, such as rotational equivariance, in machine learning models without imposing explicit architectural constraints. The authors propose a general, architecture-agnostic approach that introduces a novel metric to quantify the degree of symmetry learning, employs spectral analysis to diagnose failure modes, and leverages targeted data augmentation to guide unconstrained Transformers, including graph neural networks and PointNet-style architectures, to progressively approximate equivariance across layers during training. Experiments demonstrate that injecting only the minimal necessary inductive bias substantially enhances physical fidelity, numerical stability, and predictive accuracy, while preserving the model's expressive capacity.
📄 Abstract
The requirement of generating predictions that exactly fulfill the fundamental symmetry of the corresponding physical quantities has profoundly shaped the development of machine-learning models for physical simulations. In many cases, models are built using constrained mathematical forms that ensure that symmetries are enforced exactly. However, unconstrained models that do not obey rotational symmetries are often found to have competitive performance, and to be able to *learn* an approximately equivariant behavior to a high level of accuracy with a simple data augmentation strategy. In this paper, we introduce rigorous metrics to measure the symmetry content of the learned representations in such models, and assess the accuracy with which the outputs fulfill the equivariant condition. We apply these metrics to two unconstrained, transformer-based models operating on decorated point clouds (a graph neural network for atomistic simulations and a PointNet-style architecture for particle physics) to investigate how symmetry information is processed across architectural layers and is learned during training. Based on these insights, we establish a rigorous framework for diagnosing spectral failure modes in ML models. Enabled by this analysis, we demonstrate that one can achieve superior stability and accuracy by strategically injecting the minimum required inductive biases, preserving the high expressivity and scalability of unconstrained architectures while guaranteeing physical fidelity.
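The equivariance condition discussed in the abstract, f(Rx) = R f(x) for a rotation R, can be turned into an empirical metric by comparing a model's output on rotated inputs with the rotated output on the original inputs. The sketch below is a minimal illustration of that idea (not the paper's specific metric), assuming a `model` that maps an (N, 3) point cloud to an (N, 3) vector-valued output; rotations are sampled uniformly via QR decomposition of a Gaussian matrix.

```python
import numpy as np

def random_rotation(rng):
    """Sample a (approximately Haar-)uniform proper rotation in 3D."""
    A = rng.normal(size=(3, 3))
    Q, R = np.linalg.qr(A)
    Q = Q * np.sign(np.diag(R))  # fix column signs for uniformity
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1            # ensure det = +1 (proper rotation)
    return Q

def equivariance_error(model, points, n_samples=32, seed=0):
    """Mean relative deviation between model(R x) and R model(x)
    over random rotations R, for a vector-valued output.
    Returns 0 for an exactly equivariant model."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_samples):
        R = random_rotation(rng)
        out_of_rotated = model(points @ R.T)   # rotate input, then predict
        rotated_output = model(points) @ R.T   # predict, then rotate output
        num = np.linalg.norm(out_of_rotated - rotated_output)
        den = np.linalg.norm(rotated_output) + 1e-12
        errs.append(num / den)
    return float(np.mean(errs))
```

For instance, the identity map is exactly equivariant and scores zero, while an element-wise nonlinearity such as `np.abs` breaks rotational symmetry and yields a clearly nonzero error; a trained unconstrained model with data augmentation would be expected to land somewhere in between.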