Super-fast rates of convergence for Neural Networks Classifiers under the Hard Margin Condition

📅 2025-05-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper investigates the convergence rate of ReLU deep neural networks (DNNs) for binary classification under the Tsybakov low-noise condition with hard margin (i.e., noise exponent $q \to \infty$). We analyze the empirical risk minimization framework with squared loss and $\ell_p$-norm regularization. A novel excess risk decomposition technique is introduced to decouple approximation, estimation, and optimization errors under this setting. We establish, for the first time, that when the regression function is sufficiently smooth, DNNs achieve arbitrarily fast super-polynomial convergence rates $\mathcal{O}(n^{-\alpha})$ for any $\alpha > 0$, thereby breaking the classical $\mathcal{O}(n^{-1})$ bottleneck. The derived finite-sample upper bound on excess risk is tight and significantly improves statistical efficiency for high-confidence classifiers. This result provides a foundational theoretical guarantee for deep learning in strongly separable classification scenarios.
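The noise conditions referenced in the summary can be stated explicitly. The display below is a standard textbook formulation of Tsybakov's condition and its hard-margin limit, written out for clarity; it is not reproduced verbatim from the paper, and the constants $C$, $c_0$ are generic.

```latex
% Tsybakov low-noise condition with exponent q > 0:
% the regression function \eta(x) = P(Y = 1 | X = x) rarely lies near 1/2.
\[
  \mathbb{P}\bigl( |\eta(X) - \tfrac{1}{2}| \le t \bigr) \le C\, t^{q}
  \qquad \text{for all } t > 0.
\]
% Hard-margin condition (the limit case q -> infinity):
% \eta(X) is bounded away from 1/2 almost surely.
\[
  |\eta(X) - \tfrac{1}{2}| \ge c_0 > 0 \qquad \text{almost surely.}
\]
```

Intuitively, larger $q$ means less probability mass near the decision boundary, and the hard-margin case removes that mass entirely, which is what enables the super-polynomial rates.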

📝 Abstract
We study the classical binary classification problem for hypothesis spaces of Deep Neural Networks (DNNs) with ReLU activation under Tsybakov's low-noise condition with exponent $q>0$, and its limit-case $q\to\infty$ which we refer to as the "hard-margin condition". We show that DNNs which minimize the empirical risk with square loss surrogate and $\ell_p$ penalty can achieve finite-sample excess risk bounds of order $\mathcal{O}\left(n^{-\alpha}\right)$ for arbitrarily large $\alpha>0$ under the hard-margin condition, provided that the regression function $\eta$ is sufficiently smooth. The proof relies on a novel decomposition of the excess risk which might be of independent interest.
Problem

Research questions and friction points this paper is trying to address.

Study binary classification with DNNs under Tsybakov's low-noise condition.
Achieve fast excess risk bounds under hard-margin condition.
Prove bounds via novel excess risk decomposition.
Innovation

Methods, ideas, or system contributions that make the work stand out.

DNNs with ReLU activation for classification
Square loss surrogate with $\ell_p$ penalty
Novel excess risk decomposition technique
Nathanael Tepakbong
Department of Data Science, City University of Hong Kong, Hong Kong SAR
Ding-Xuan Zhou
University of Sydney
theory of deep learning, statistical learning, wavelets, approximation theory
Xiang Zhou
Department of Mathematics, City University of Hong Kong, Hong Kong SAR