🤖 AI Summary
This work addresses the lack of quantitative theory for the approximation rates of neural networks that are constrained to respect group symmetries. Focusing on α-Hölder continuous group-equivariant and invariant target functions, the paper derives rigorous, quantitative error bounds for several prominent architectures, including Deep Sets, Sumformer, Transformer, and frame-averaging models, by combining approximation-theoretic arguments with the structural properties of each architecture. The analysis shows that hard-coding symmetries does not compromise expressive power: these equivariant networks achieve approximation error rates of the same order as equally-sized ReLU multilayer perceptrons (MLPs) when approximating symmetric functions, closing a gap in the theoretical foundations of equivariant deep learning.
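For readers unfamiliar with the terminology, the two assumptions mentioned above can be written out as follows; the notation here is a generic reading aid, and the paper's exact norms, constants, and group actions may differ. A function $f: [0,1]^N \to \mathbb{R}$ is α-Hölder with exponent $0 < \alpha \le 1$ and constant $C > 0$ if

$$|f(x) - f(y)| \le C\,\lVert x - y \rVert^{\alpha} \quad \text{for all } x, y \in [0,1]^N,$$

and, for a group $G$ acting on the inputs (and, in the equivariant case, also on the outputs), invariance and equivariance of $f$ and $F$ mean

$$f(g \cdot x) = f(x), \qquad F(g \cdot x) = g \cdot F(x) \quad \text{for all } g \in G.$$

For the permutation group, for instance, invariance simply means that the output does not change when the elements of an input set are reordered.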
📝 Abstract
The universal approximation theorem establishes that neural networks can approximate any continuous function on a compact set. Later works in approximation theory provide quantitative approximation rates for ReLU networks on the class of $\alpha$-H\"older functions $f: [0,1]^N \to \mathbb{R}$. The goal of this paper is to provide similar quantitative approximation results in the context of group equivariant learning, where the learned $\alpha$-H\"older function is known to obey certain group symmetries. While there has been much interest in understanding the universal approximation properties of equivariant models, very few quantitative approximation results are known for such models. In this paper, we bridge this gap by deriving quantitative approximation rates for several prominent group-equivariant and invariant architectures. The architectures that we consider include: the permutation-invariant Deep Sets architecture; the permutation-equivariant Sumformer and Transformer architectures; joint invariance to permutations and rigid motions using invariant networks based on frame averaging; and general bi-Lipschitz invariant models. Overall, we show that equally-sized ReLU MLPs and equivariant architectures are equally expressive over equivariant functions. Thus, hard-coding equivariance does not result in a loss of expressivity or approximation power in these models.
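As a concrete illustration of the simplest architecture on this list, below is a minimal sketch of a Deep Sets-style model $f(X) = \rho\big(\sum_i \phi(x_i)\big)$, where $\phi$ and $\rho$ are small ReLU MLPs. The widths, depths, and random untrained weights here are illustrative assumptions, not the paper's construction; the point is only that summing over the set dimension makes the output permutation-invariant by construction.

```python
# Minimal Deep Sets-style sketch: f(X) = rho( sum_i phi(x_i) ),
# with phi and rho implemented as small ReLU MLPs in plain NumPy.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def mlp(x, weights):
    """Apply a ReLU MLP given a list of (W, b) layer parameters."""
    for W, b in weights[:-1]:
        x = relu(x @ W + b)
    W, b = weights[-1]
    return x @ W + b  # final layer is linear

def init_mlp(sizes, rng):
    """Random (untrained) parameters; sizes = [d_in, hidden, ..., d_out]."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

phi = init_mlp([3, 32, 32], rng)   # per-element encoder phi: R^3 -> R^32
rho = init_mlp([32, 32, 1], rng)   # set-level decoder   rho: R^32 -> R

def deep_sets(X):
    """X has shape (n_points, 3); returns a permutation-invariant scalar."""
    return mlp(mlp(X, phi).sum(axis=0), rho)

X = rng.standard_normal((5, 3))
perm = rng.permutation(5)
print(deep_sets(X), deep_sets(X[perm]))   # equal up to floating-point rounding
```

The invariance comes entirely from the symmetric pooling (the sum), not from training; the paper's results quantify how accurately such sum-decompositions, and the other equivariant architectures listed above, can approximate arbitrary $\alpha$-H\"older symmetric targets relative to an equally-sized ReLU MLP.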