🤖 AI Summary
This paper investigates whether the compositional structure of learning tasks underlies the advantage of deep networks over classical models such as kernel methods. To formalize compositional structure, we consider the multi-index model (MIM) as a minimal yet expressive benchmark. We then propose hyper-kernel ridge regression (HKRR), a kernel-based method adapted to MIM learning that combines the structural inductive biases of neural networks with the theoretical interpretability of kernel methods. We design specialized optimization algorithms based on alternating minimization and alternating gradients, and derive a sample complexity upper bound for HKRR. Theoretically and empirically, HKRR significantly outperforms standard kernel methods on MIM tasks, effectively mitigating the curse of dimensionality. Our results elucidate how structural priors enhance high-dimensional learning performance and offer a novel perspective on the origins of deep learning's empirical success.
📝 Abstract
Deep neural networks excel in high-dimensional problems, outperforming models such as kernel methods, which suffer from the curse of dimensionality. However, the theoretical foundations of this success remain poorly understood. We follow the idea that the compositional structure of the learning task is the key factor determining when deep networks outperform other approaches. Taking a step towards formalizing this idea, we consider a simple compositional model, namely the multi-index model (MIM). In this context, we introduce and study hyper-kernel ridge regression (HKRR), an approach blending neural networks and kernel methods. Our main contribution is a sample complexity result demonstrating that HKRR can adaptively learn MIM, overcoming the curse of dimensionality. Further, we exploit the kernel nature of the estimator to develop ad hoc optimization approaches. In particular, we contrast alternating minimization and alternating gradient methods, both theoretically and numerically. These numerical results complement and reinforce our theoretical findings.
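The multi-index structure and the alternating scheme described above can be illustrated concretely. What follows is a minimal sketch, not the paper's actual algorithm: the dimensions, the tanh link function, the Gaussian kernel, the ridge parameter, and the finite-difference W-step with a small candidate line search are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic multi-index data: y = g(W_true x) with effective dimension k << d.
d, k, n = 20, 1, 100
W_true = rng.standard_normal((k, d))
W_true /= np.linalg.norm(W_true)
X = rng.standard_normal((n, d))
y = np.tanh(X @ W_true.T)[:, 0]                     # smooth single-index link g

def gauss_kernel(A, B, s=1.0):
    """Gaussian kernel evaluated on k-dimensional projected inputs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * s * s))

lam = 1e-2                                          # ridge parameter (illustrative)

def ridge_obj(W, alpha):
    """Kernel ridge objective ||K a - y||^2 / n + lam a' K a at projection W."""
    K = gauss_kernel(X @ W.T, X @ W.T)
    r = K @ alpha - y
    return r @ r / n + lam * alpha @ K @ alpha

def fd_grad(W, alpha, eps=1e-4):
    """Finite-difference gradient in W (a stand-in for an analytic gradient)."""
    G = np.zeros_like(W)
    base = ridge_obj(W, alpha)
    for i in range(k):
        for j in range(d):
            Wp = W.copy()
            Wp[i, j] += eps
            G[i, j] = (ridge_obj(Wp, alpha) - base) / eps
    return G

# Alternating minimization: exact KRR step in alpha, then a W-step chosen among
# candidate step sizes that include 0, so the objective can never increase.
W = rng.standard_normal((k, d))
W /= np.linalg.norm(W)
history = []
for _ in range(30):
    K = gauss_kernel(X @ W.T, X @ W.T)
    alpha = np.linalg.solve(K + lam * n * np.eye(n), y)  # closed-form KRR step
    G = fd_grad(W, alpha)
    cands = [W - t * G for t in (0.0, 1e-2, 1e-1, 1.0)]
    vals = [ridge_obj(Wc, alpha) for Wc in cands]
    best = int(np.argmin(vals))
    W = cands[best]
    history.append(vals[best])

align = abs((W_true @ W.T).item()) / np.linalg.norm(W)  # |cos angle| with true index
print(f"objective: {history[0]:.4f} -> {history[-1]:.4f}, alignment: {align:.3f}")
```

Because the alpha-step is an exact minimizer and the candidate step sizes for W include 0, each full round is monotone non-increasing in the ridge objective; an alternating gradient variant, as contrasted in the paper, would replace the exact alpha-solve with a gradient step as well.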