🤖 AI Summary
This paper investigates the convergence rate of ReLU deep neural networks (DNNs) for binary classification under the Tsybakov low-noise condition with hard margin (i.e., noise exponent $q \to \infty$). We analyze the empirical risk minimization framework with the squared loss and $\ell_p$-norm regularization. A novel excess risk decomposition technique is introduced to decouple approximation, estimation, and optimization errors in this setting. We establish, for the first time, that when the regression function is sufficiently smooth, DNNs achieve convergence rates $\mathcal{O}(n^{-\alpha})$ for any $\alpha > 0$, i.e., faster than any fixed polynomial rate, thereby breaking the classical $\mathcal{O}(n^{-1})$ bottleneck. The derived finite-sample upper bound on the excess risk is tight and significantly improves statistical efficiency for high-confidence classifiers. This result provides a foundational theoretical guarantee for deep learning in strongly separable classification scenarios.
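For context, the setting described above can be sketched as follows; the paper's exact definitions, constants, and penalty form may differ, so the notation below ($\eta$, $C$, $c_0$, $\lambda$, $\mathcal{F}$, labels $y_i \in \{-1,+1\}$) is illustrative only. One standard statement of the Tsybakov low-noise condition with exponent $q$, together with its hard-margin limit, is

$$
\mathbb{P}\bigl(0 < |2\eta(X) - 1| \le t\bigr) \;\le\; C\,t^{q} \quad \text{for all } t > 0,
\qquad\text{and, as } q \to \infty,\qquad
|2\eta(X) - 1| \;\ge\; c_0 > 0 \ \text{ almost surely},
$$

where $\eta(x) = \mathbb{P}(Y = 1 \mid X = x)$ is the regression function. A schematic form of the regularized empirical risk minimizer over a ReLU DNN class $\mathcal{F} = \{f_\theta\}$ is

$$
\hat{f}_n \;\in\; \operatorname*{arg\,min}_{f_\theta \in \mathcal{F}}
\;\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - f_\theta(x_i)\bigr)^2 \;+\; \lambda\,\|\theta\|_p^p .
$$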
📝 Abstract
We study the classical binary classification problem for hypothesis spaces of Deep Neural Networks (DNNs) with ReLU activation under Tsybakov's low-noise condition with exponent $q>0$, and its limit-case $q \to \infty$ which we refer to as the "hard-margin condition". We show that DNNs which minimize the empirical risk with square loss surrogate and $\ell_p$ penalty can achieve finite-sample excess risk bounds of order $\mathcal{O}\left(n^{-\alpha}\right)$ for arbitrarily large $\alpha > 0$ under the hard-margin condition, provided that the regression function $\eta$ is sufficiently smooth. The proof relies on a novel decomposition of the excess risk which might be of independent interest.
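To illustrate the flavor of the decomposition the abstract refers to (the actual splitting in the paper may be organized differently), the excess surrogate risk of an empirical minimizer $\hat f_n$ is typically bounded by three terms of the form

$$
\mathcal{E}(\hat f_n) - \inf_{f} \mathcal{E}(f)
\;\le\;
\underbrace{\inf_{f \in \mathcal{F}} \mathcal{E}(f) - \inf_{f} \mathcal{E}(f)}_{\text{approximation}}
\;+\;
\underbrace{2\,\sup_{f \in \mathcal{F}} \bigl|\mathcal{E}(f) - \hat{\mathcal{E}}_n(f)\bigr|}_{\text{estimation}}
\;+\;
\underbrace{\hat{\mathcal{E}}_n(\hat f_n) - \inf_{f \in \mathcal{F}} \hat{\mathcal{E}}_n(f)}_{\text{optimization}},
$$

where $\mathcal{E}$ and $\hat{\mathcal{E}}_n$ denote the population and empirical square-loss risks and $\mathcal{F}$ the ReLU DNN class; passing from the surrogate excess risk to the misclassification excess risk then uses a standard calibration (comparison) inequality, whose exponent improves as the noise exponent $q$ grows.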