🤖 AI Summary
This paper investigates the approximation capacity and statistical convergence rates of convolutional neural networks (CNNs) with one-sided zero padding and multi-channel architectures in nonparametric regression and binary classification. Methodologically, it establishes a new approximation bound for CNNs under explicit weight constraints and develops a covering-number analysis that jointly accounts for the network architecture and the weight magnitudes, yielding upper bounds that improve on existing ones in some regimes. Consequently, it shows that the CNN least-squares estimator achieves the minimax optimal rate over Sobolev classes of smooth functions; CNN classifiers trained with the hinge or logistic loss likewise attain convergence rates that are minimax optimal in some common settings. These results characterize the statistical efficiency of structured CNNs in nonparametric learning and offer theoretical guidance for architectural design.
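To make the architecture concrete, here is a minimal sketch of a one-dimensional CNN with one-sided zero padding and multiple channels, written in PyTorch. The depth, channel count, filter length, pooling, and linear readout are illustrative assumptions, not the paper's exact construction or weight constraints.

```python
# A minimal sketch (not the paper's exact construction) of a 1-D CNN with
# one-sided zero padding and multiple channels. Depth, channel width, and
# filter length are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneSidedCNN(nn.Module):
    def __init__(self, depth=3, channels=8, filter_len=4):
        super().__init__()
        self.filter_len = filter_len
        convs, in_ch = [], 1
        for _ in range(depth):
            # padding=0 here; the one-sided zero padding is applied
            # manually in forward()
            convs.append(nn.Conv1d(in_ch, channels, kernel_size=filter_len))
            in_ch = channels
        self.convs = nn.ModuleList(convs)
        self.readout = nn.Linear(channels, 1)  # linear layer after the conv stack

    def forward(self, x):                      # x: (batch, 1, d) input
        for conv in self.convs:
            # zero-pad only on the left so the sequence length is preserved
            x = F.pad(x, (self.filter_len - 1, 0))
            x = torch.relu(conv(x))
        # average-pool over positions, then read out a scalar prediction
        return self.readout(x.mean(dim=-1))

x = torch.randn(16, 1, 32)             # 16 samples of a 32-dimensional input
print(OneSidedCNN()(x).shape)          # torch.Size([16, 1])
```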
📝 Abstract
We study the approximation and learning capacities of convolutional neural networks (CNNs) with one-sided zero padding and multiple channels. Our first result proves a new approximation bound for CNNs with certain constraints on the weights. Our second result gives a new analysis of the covering numbers of feed-forward neural networks, with CNNs as a special case. The analysis carefully takes into account the size of the weights and hence gives better bounds than the existing literature in some situations. Using these two results, we derive rates of convergence for CNN-based estimators in many learning problems. In particular, we establish minimax optimal convergence rates for CNN-based least squares estimators of smooth functions in the nonparametric regression setting. For binary classification, we derive convergence rates for CNN classifiers trained with the hinge loss and the logistic loss, and we show that the obtained rates are minimax optimal in some common settings.
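For reference, the benchmark behind the abstract's optimality claim in regression is the classical minimax rate for estimating a smooth function (Stone, 1982), sketched below for a smoothness-r Sobolev ball on [0,1]^d. The paper's exact function class, norms, and any logarithmic factors may differ from this generic statement.

```latex
% Classical minimax rate for nonparametric regression of a function of
% smoothness r in d variables; "minimax optimal" means the CNN
% least-squares estimator matches this benchmark. The exact class and
% norms used in the paper may differ.
\[
  \inf_{\hat{f}_n}\ \sup_{f \in W^{r}([0,1]^d)}
    \mathbb{E}\,\bigl\| \hat{f}_n - f \bigr\|_{L^2}^{2}
    \;\asymp\; n^{-\frac{2r}{2r+d}} .
\]
```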