🤖 AI Summary
This paper investigates the approximation capacity and statistical convergence rates of convolutional neural networks (CNNs) with one-sided zero padding and multi-channel architectures in nonparametric regression and binary classification. Methodologically, it establishes a new approximation bound for CNNs under explicit weight constraints and develops a covering-number analysis that jointly accounts for the network architecture and the weight magnitudes, yielding upper bounds that improve on existing ones in some regimes. Consequently, it shows that the CNN least-squares estimator achieves the minimax optimal rate over Sobolev classes of smooth functions; CNN classifiers trained with the hinge or logistic loss likewise attain convergence rates that are minimax optimal in some common settings. These results characterize the statistical efficiency of structured CNNs in nonparametric learning and offer theoretical guidance for architectural design.
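To make the architecture concrete, here is a minimal sketch of a one-dimensional CNN with one-sided zero padding and multiple channels, written in PyTorch. The depth, channel count, filter length, pooling, and linear readout are illustrative assumptions, not the paper's exact construction or weight constraints.

```python
# A minimal sketch (not the paper's exact construction) of a 1-D CNN with
# one-sided zero padding and multiple channels. Depth, channel width, and
# filter length are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneSidedCNN(nn.Module):
    def __init__(self, depth=3, channels=8, filter_len=4):
        super().__init__()
        self.filter_len = filter_len
        convs, in_ch = [], 1
        for _ in range(depth):
            # padding=0 here; the one-sided zero padding is applied
            # manually in forward()
            convs.append(nn.Conv1d(in_ch, channels, kernel_size=filter_len))
            in_ch = channels
        self.convs = nn.ModuleList(convs)
        self.readout = nn.Linear(channels, 1)  # linear layer after the conv stack

    def forward(self, x):                      # x: (batch, 1, d) input
        for conv in self.convs:
            # zero-pad only on the left so the sequence length is preserved
            x = F.pad(x, (self.filter_len - 1, 0))
            x = torch.relu(conv(x))
        # average-pool over positions, then read out a scalar prediction
        return self.readout(x.mean(dim=-1))

x = torch.randn(16, 1, 32)             # 16 samples of a 32-dimensional input
print(OneSidedCNN()(x).shape)          # torch.Size([16, 1])
```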
📝 Abstract
We study the approximation and learning capacities of convolutional neural networks (CNNs) with one-sided zero padding and multiple channels. Our first result proves a new approximation bound for CNNs with certain constraints on the weights. Our second result gives a new analysis of the covering numbers of feed-forward neural networks, with CNNs as a special case. The analysis carefully takes into account the size of the weights and hence gives better bounds than the existing literature in some situations. Using these two results, we derive rates of convergence for CNN-based estimators in many learning problems. In particular, we establish minimax optimal convergence rates for CNN-based least squares estimators of smooth functions in the nonparametric regression setting. For binary classification, we derive convergence rates for CNN classifiers trained with the hinge loss and the logistic loss, and we show that the obtained rates are minimax optimal in some common settings.
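For reference, the benchmark behind the abstract's optimality claim in regression is the classical minimax rate for estimating a smooth function (Stone, 1982), sketched below for a smoothness-r Sobolev ball on [0,1]^d. The paper's exact function class, norms, and any logarithmic factors may differ from this generic statement.

```latex
% Classical minimax rate for nonparametric regression of a function of
% smoothness r in d variables; "minimax optimal" means the CNN
% least-squares estimator matches this benchmark. The exact class and
% norms used in the paper may differ.
\[
  \inf_{\hat{f}_n}\ \sup_{f \in W^{r}([0,1]^d)}
    \mathbb{E}\,\bigl\| \hat{f}_n - f \bigr\|_{L^2}^{2}
    \;\asymp\; n^{-\frac{2r}{2r+d}} .
\]
```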