🤖 AI Summary
This work addresses the lack of a unified theoretical framework for the generalization behavior of deep residual networks (ResNets) across their discrete- and continuous-time formulations, where existing results differ in both sample complexity and underlying assumptions. Adopting a dynamical-systems perspective and combining Rademacher complexity, the flow maps of dynamical systems, and the convergence of ResNets in the infinite-depth limit, the paper establishes the first depth-independent generalization bound that incorporates a structure-dependent negative term. The analysis characterizes generalization for discrete- and continuous-time ResNets in a unified way under weaker assumptions, and yields a generalization error bound of order $O(1/\sqrt{S})$ in the number of training samples $S$, bridging the theoretical gap between the two settings.
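To make the dynamical-systems viewpoint concrete: a residual block can be read as an explicit Euler step of an ODE, whose flow map the continuous-time model computes. The notation below ($x_k$ for hidden states, $f$ for the residual branch, $\theta_k$ for layer parameters, $L$ for depth) is illustrative and need not match the paper's.

$$x_{k+1} = x_k + \frac{1}{L}\, f(x_k, \theta_k), \quad k = 0, \dots, L-1, \qquad \xrightarrow{\; L \to \infty \;} \qquad \dot{x}(t) = f\big(x(t), \theta(t)\big), \quad t \in [0,1].$$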
📝 Abstract
Deep neural networks (DNNs) have significantly advanced machine learning, with model depth playing a central role in their success. The dynamical-systems modeling approach has recently emerged as a powerful framework, offering new mathematical insights into the structure and learning behavior of DNNs. In this work, we establish generalization error bounds for both discrete- and continuous-time residual networks (ResNets) by combining Rademacher complexity, the flow maps of dynamical systems, and the convergence behavior of ResNets in the deep-layer limit. The resulting bounds are of order $O(1/\sqrt{S})$ with respect to the number of training samples $S$ and include a structure-dependent negative term, yielding depth-uniform and asymptotic generalization bounds under milder assumptions. These findings provide a unified understanding of generalization across discrete- and continuous-time ResNets, helping to close the gap between the two settings in both the order of sample complexity and the required assumptions.
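For orientation, the classical Rademacher-complexity bound that analyses of this type start from is the following standard result: for a function class $\mathcal{F}$ taking values in $[0,1]$ and $S$ i.i.d. samples, with probability at least $1-\delta$,

$$\sup_{f \in \mathcal{F}} \left( \mathbb{E}[f(x)] - \frac{1}{S} \sum_{i=1}^{S} f(x_i) \right) \le 2\,\hat{\mathfrak{R}}_S(\mathcal{F}) + 3\sqrt{\frac{\log(2/\delta)}{2S}}, \qquad \hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma} \left[ \sup_{f \in \mathcal{F}} \frac{1}{S} \sum_{i=1}^{S} \sigma_i f(x_i) \right],$$

where the $\sigma_i$ are i.i.d. Rademacher signs. The paper's contribution is to bound this complexity for ResNet flow maps uniformly in depth; the precise constants and the structure-dependent negative term are specific to the paper and not reproduced here.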