🤖 AI Summary
This work studies the $L_q(\Omega)$-norm approximation of functions in the Sobolev space $W^s(L_p(\Omega))$ by shallow ReLU$^k$ neural networks on a bounded domain $\Omega \subset \mathbb{R}^d$. To overcome the smoothness barrier imposed by the piecewise polynomial structure of ReLU$^k$ networks, we combine Radon transform techniques with recent results from discrepancy theory. Our main result establishes a nearly optimal approximation rate of $O(n^{-s/d})$, up to logarithmic factors, under the condition $s \leq k + (d+1)/2$. This rate holds in a broad regime including $q \leq p$ and $p \geq 2$, significantly improving and generalizing prior bounds. The analysis reveals the strong adaptive approximation capability of shallow ReLU$^k$ networks for highly smooth functions, demonstrating that their expressive power extends beyond classical piecewise-polynomial limitations when geometric integral representations are exploited.
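In the standard notation (not fixed in the summary itself) where $\Sigma_n^k$ denotes the class of shallow ReLU$^k$ networks with $n$ neurons, the claimed rate can be sketched as follows; the precise form of the bound and the exponent $C$ of the logarithmic factor are assumptions left unspecified here:

$$
\Sigma_n^k = \Bigl\{\, x \mapsto \sum_{i=1}^{n} a_i\,(\omega_i \cdot x + b_i)_+^k \;:\; a_i, b_i \in \mathbb{R},\ \omega_i \in \mathbb{R}^d \Bigr\},
$$

$$
\inf_{f_n \in \Sigma_n^k} \| f - f_n \|_{L_q(\Omega)} \;\lesssim\; n^{-s/d} (\log n)^{C}\, \| f \|_{W^s(L_p(\Omega))},
\qquad q \leq p,\ \ p \geq 2,\ \ s \leq k + \tfrac{d+1}{2}.
$$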
📝 Abstract
Let $\Omega \subset \mathbb{R}^d$ be a bounded domain. We consider the problem of how efficiently shallow neural networks with the ReLU$^k$ activation function can approximate functions from Sobolev spaces $W^s(L_p(\Omega))$ with error measured in the $L_q(\Omega)$-norm. Utilizing the Radon transform and recent results from discrepancy theory, we provide a simple proof of nearly optimal approximation rates in a variety of cases, including when $q \leq p$, $p \geq 2$, and $s \leq k + (d+1)/2$. The rates we derive are optimal up to logarithmic factors, and significantly generalize existing results. An interesting consequence is that the adaptivity of shallow ReLU$^k$ neural networks enables them to obtain optimal approximation rates for smoothness up to order $s = k + (d+1)/2$, even though they represent piecewise polynomials of fixed degree $k$.
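For concreteness, here is a minimal runnable sketch (not taken from the paper) of the standard shallow ReLU$^k$ parameterization referenced above; the weights below are random placeholders, not the approximants constructed in the analysis.

```python
import numpy as np

def relu_k(t, k):
    """ReLU^k activation: max(0, t) raised to the k-th power."""
    return np.maximum(t, 0.0) ** k

def shallow_relu_k_net(x, a, W, b, k):
    """Evaluate f_n(x) = sum_i a_i * relu_k(w_i . x + b_i).

    x : (d,) input point
    a : (n,) outer coefficients
    W : (n, d) inner weights (directions)
    b : (n,) biases
    """
    return np.dot(a, relu_k(W @ x + b, k))

# Example: a random 100-neuron ReLU^2 network evaluated at a point in [0, 1]^3.
rng = np.random.default_rng(0)
d, n, k = 3, 100, 2
a, W, b = rng.normal(size=n), rng.normal(size=(n, d)), rng.normal(size=n)
x = rng.uniform(size=d)
print(shallow_relu_k_net(x, a, W, b, k))
```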