🤖 AI Summary
Deep learning models for tabular data suffer from slow convergence, high hyperparameter sensitivity, and unstable early-stage training. To address these challenges, this paper proposes a Random Fourier Features (RFF)-based preprocessing method guided by Neural Tangent Kernel (NTK) theory. The method non-linearly maps raw features into a fixed-frequency domain, yielding a parameter- and architecture-agnostic plug-and-play module. We theoretically establish that it effectively constrains the initial NTK spectrum and introduces a beneficial gradient-flow bias, thereby accelerating optimization dynamics. Empirical evaluation across multiple standard tabular benchmarks demonstrates substantial improvements: training iterations are reduced by 30–50% for comparable performance, sensitivity to hyperparameter tuning drops significantly, and generalization is enhanced. The approach thus offers a principled, lightweight, and broadly applicable solution for improving the training efficiency and robustness of deep tabular models.
📝 Abstract
While random Fourier features are a classic tool in kernel methods, their utility as a preprocessing step for deep learning on tabular data has been largely overlooked. Motivated by shortcomings in tabular deep learning pipelines, revealed through Neural Tangent Kernel (NTK) analysis, we revisit and repurpose random Fourier mappings as a parameter-free, architecture-agnostic transformation. By projecting each input into a fixed feature space via sine and cosine projections with frequencies drawn once at initialization, this approach circumvents the need for ad hoc normalization or additional learnable embeddings. We show within the NTK framework that this mapping (i) bounds and conditions the network's initial NTK spectrum, and (ii) introduces a bias that shortens the optimization trajectory, thereby accelerating gradient-based training. Together, these effects pre-condition the network with a stable kernel from the outset. Empirically, we demonstrate that deep networks trained on Fourier-transformed inputs converge more rapidly and consistently achieve strong final performance, often with fewer epochs and less hyperparameter tuning. Our findings establish random Fourier preprocessing as a theoretically motivated, plug-and-play enhancement for tabular deep learning.
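The fixed sine/cosine projection described above can be sketched in a few lines of NumPy. This is a minimal illustration of the classic random Fourier feature map, not the paper's exact implementation; the function name, the Gaussian frequency distribution, and the hyperparameters `num_features` and `sigma` are assumptions for the sketch.

```python
import numpy as np

def random_fourier_map(X, num_features=128, sigma=1.0, seed=0):
    """Map inputs X of shape (n_samples, d) to fixed sine/cosine features.

    Frequencies W are drawn once (here from N(0, 1/sigma^2)) and then held
    fixed, mirroring the parameter-free, draw-once-at-initialization idea.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Fixed random frequency matrix, never trained.
    W = rng.normal(0.0, 1.0 / sigma, size=(d, num_features))
    proj = X @ W
    # Concatenate cosine and sine projections; the scale keeps each
    # feature vector's norm bounded regardless of the input's scale.
    return np.sqrt(1.0 / num_features) * np.hstack([np.cos(proj), np.sin(proj)])

X = np.random.default_rng(1).normal(size=(4, 3))
Z = random_fourier_map(X, num_features=8)
print(Z.shape)  # → (4, 16)
```

Because the mapping is applied once to the raw table before training, it drops in ahead of any architecture without adding learnable parameters; downstream layers simply see the bounded transformed features.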