🤖 AI Summary
This paper investigates the distributionally robust universal approximation property of neural networks over weakly compact families of probability measures. For mainstream architectures, including feedforward networks, deep narrow networks, and functional-input networks, it establishes for the first time a unified density theory in Orlicz spaces rather than in conventional $L^p$ spaces. Methodologically, the work integrates functional analysis, Orlicz space theory, and weak convergence techniques, covering non-polynomial activation functions and accommodating nonsmooth activations (e.g., ReLU) as well as deep narrow topologies. The main contribution is a rigorous characterization of approximation capability under distributional robustness, substantially extending the scope of classical approximation theory and providing a firmer mathematical foundation for robust learning and uncertainty modeling.
📝 Abstract
The universal approximation property, uniformly with respect to weakly compact families of measures, is established for several classes of neural networks. To that end, we prove that these neural networks are dense in Orlicz spaces, thereby extending classical universal approximation theorems beyond the traditional $L^p$-setting. The covered classes of neural networks include widely used architectures such as feedforward neural networks with non-polynomial activation functions, deep narrow networks with ReLU activation functions, and functional-input neural networks.
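Informally, and in our own notation rather than the paper's (the network class $\mathcal{N}$, Orlicz function $\Phi$, and norm $\|\cdot\|_{L^{\Phi}(\mu)}$ below are illustrative symbols, not verbatim from the text), the uniform approximation property over a weakly compact family of measures can be sketched as:

```latex
% Hedged sketch of the statement (symbols are ours): given a weakly
% compact family \mathcal{P} of probability measures, an Orlicz
% function \Phi, and a target function f in the relevant Orlicz space,
% the network class \mathcal{N} approximates f uniformly over \mathcal{P}:
\forall\, \varepsilon > 0 \;\; \exists\, \phi \in \mathcal{N} : \quad
  \sup_{\mu \in \mathcal{P}} \,\bigl\| f - \phi \bigr\|_{L^{\Phi}(\mu)}
  < \varepsilon .
```

The key point is that the supremum is taken over the whole family $\mathcal{P}$, so a single network works simultaneously for every measure in the family; classical $L^p$ universal approximation corresponds to the special case of a single fixed measure and $\Phi(t) = t^p$.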