Learning Multi-Index Models with Hyper-Kernel Ridge Regression

📅 2025-10-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper investigates whether the compositional structure of learning tasks underlies the superiority of deep networks over classical models such as kernel methods. To formalize compositional structure, we introduce the Multi-Index Model (MIM) as a minimal yet expressive benchmark. We then propose Hyperkernel Ridge Regression (HKRR), the first kernel-based method adaptively extended to MIM learning, which integrates neural networks’ structural inductive biases with kernel methods’ theoretical interpretability. We design specialized optimization algorithms based on alternating minimization and alternating gradients, and derive a tight sample complexity upper bound for HKRR. Theoretically and empirically, HKRR significantly outperforms standard kernel methods on MIM tasks, effectively mitigating the curse of dimensionality. Our results elucidate how structural priors enhance high-dimensional learning performance and offer a novel perspective on the origins of deep learning’s empirical success.

Technology Category

Application Category

📝 Abstract

Deep neural networks excel in high-dimensional problems, outperforming models such as kernel methods, which suffer from the curse of dimensionality. However, the theoretical foundations of this success remain poorly understood. We follow the idea that the compositional structure of the learning task is the key factor determining when deep networks outperform other approaches. Taking a step towards formalizing this idea, we consider a simple compositional model, namely the multi-index model (MIM). In this context, we introduce and study hyper-kernel ridge regression (HKRR), an approach blending neural networks and kernel methods. Our main contribution is a sample complexity result demonstrating that HKRR can adaptively learn MIM, overcoming the curse of dimensionality. Further, we exploit the kernel nature of the estimator to develop ad hoc optimization approaches. Indeed, we contrast alternating minimization and alternating gradient methods both theoretically and numerically. These numerical results complement and reinforce our theoretical findings.

Problem

Research questions and friction points this paper is trying to address.

Learning multi-index models with hyper-kernel ridge regression

Overcoming the curse of dimensionality in kernel methods

Theoretical analysis of deep networks versus kernel methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hyper-kernel ridge regression blends neural networks and kernels

Adaptively learns multi-index models overcoming dimensionality curse

Uses alternating minimization and gradient methods for optimization

🔎 Similar Papers

No similar papers found.

Authors to Follow