🤖 AI Summary
This work studies the link between interpolation and generalization in semi-autonomous neural ordinary differential equations (SA-NODEs) for supervised regression, from a control-theoretic perspective. By introducing the notion of simultaneous cell controllability (SCC), under which the flow maps prescribed disjoint cells into arbitrarily small target balls, the authors constructively show that SA-NODEs can interpolate admissible finite datasets exactly while emulating piecewise-constant nonparametric estimators, which yields explicit population-risk bounds. The analysis identifies explicit time dependence as essential: two-layer autonomous neural ODEs can interpolate geometrically nondegenerate datasets, but structural obstructions prevent them from achieving SCC. The resulting risk bounds recover the rates of classical histogram and nearest-neighbor estimators, provided the network width scales suitably with the sample size, and in numerical experiments trained SA-NODEs achieve competitive, often lower, test errors than these baselines.
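For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a one-dimensional histogram regressor, the kind of piecewise-constant estimator the summary refers to. It is not taken from the paper: the function names, the domain `[lo, hi]`, the equal-width cells, and the convention that empty cells predict `0.0` are all assumptions made for illustration.

```python
def cell_index(x, n_bins, lo=0.0, hi=1.0):
    """Index of the equal-width cell of [lo, hi] containing x (clipped to a valid cell)."""
    width = (hi - lo) / n_bins
    return min(max(int((x - lo) / width), 0), n_bins - 1)


def fit_histogram_regressor(xs, ys, n_bins, lo=0.0, hi=1.0):
    """Return the per-cell label averages of a histogram regressor."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        k = cell_index(x, n_bins, lo, hi)
        sums[k] += y
        counts[k] += 1
    # Assumption for this sketch: a cell with no samples predicts 0.0.
    return [s / c if c > 0 else 0.0 for s, c in zip(sums, counts)]


def predict_histogram(means, x, lo=0.0, hi=1.0):
    """Piecewise-constant prediction: the average label of the cell containing x."""
    return means[cell_index(x, len(means), lo, hi)]


means = fit_histogram_regressor([0.1, 0.2, 0.6, 0.9], [1.0, 3.0, 5.0, 7.0], n_bins=2)
print(means, predict_histogram(means, 0.25))
```

The prediction is constant on each cell, which is why a flow that sends every cell into a small ball around a single target value can imitate such an estimator.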
📝 Abstract
We study supervised regression with neural ODEs (NODEs) from a control-theoretic perspective to derive explicit population-risk bounds. We focus on a widely used class of non-autonomous models with constant parameters and explicit time dependence, which we call semi-autonomous NODEs (SA-NODEs). We constructively prove that SA-NODEs are capable of exact interpolation of admissible finite datasets, and even satisfy a stronger property that we call simultaneous cell controllability (SCC): their flows can map prescribed disjoint cells into arbitrarily small target balls. This property is the mechanism that upgrades interpolation into quantitative generalization, by allowing SA-NODEs to emulate piecewise-constant nonparametric estimators. Consequently, our risk bounds recover the rates of histogram and nearest-neighbor estimators, provided the network width satisfies a conservative scaling with the sample size. Numerical experiments show that trained SA-NODEs achieve competitive, and often lower, test errors than these baselines. Finally, we show that the explicit time dependence is essential. Although two-layer autonomous NODEs can interpolate geometrically nondegenerate datasets, structural obstructions prevent them from achieving SCC. These limitations, further confirmed numerically, support the view that SA-NODEs provide a minimal effective architecture for learning.
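To make the distinction between semi-autonomous and autonomous models concrete, here is a small sketch under stated assumptions. The abstract does not give the exact parametrization, so the vector field below, dx/dt = W·relu(A·x + a·t + b) with constant `W`, `A`, `a`, `b`, is an assumed form, as are the ReLU activation, the forward-Euler integrator, and all function names. Setting `a` to zero removes the explicit time dependence and gives a two-layer autonomous field.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def sa_node_field(x, t, W, A, a, b):
    """Assumed SA-NODE vector field W * relu(A x + a t + b) with constant parameters."""
    hidden = [
        relu(sum(A_jk * x_k for A_jk, x_k in zip(A_j, x)) + a_j * t + b_j)
        for A_j, a_j, b_j in zip(A, a, b)
    ]
    return [sum(W_ij * h_j for W_ij, h_j in zip(W_i, hidden)) for W_i in W]


def flow(x0, W, A, a, b, T=1.0, n_steps=100):
    """Approximate the time-T flow map with forward Euler (an illustrative choice)."""
    x = list(x0)
    dt = T / n_steps
    for k in range(n_steps):
        v = sa_node_field(x, k * dt, W, A, a, b)
        x = [x_i + dt * v_i for x_i, v_i in zip(x, v)]
    return x


# One state dimension, one hidden unit; the parameters never change over time.
W, A, b = [[1.0]], [[0.0]], [0.0]
print(flow([2.0], W, A, a=[1.0], b=b))  # time enters through the a*t term
print(flow([2.0], W, A, a=[0.0], b=b))  # autonomous case: here the field vanishes
```

In this toy configuration the parameters are identical in both calls except for `a`, so any movement of the state in the first call comes only from the explicit `t` inside the activation. It illustrates the mechanism the abstract calls essential and says nothing about the paper's actual constructions or proofs.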