🤖 AI Summary
This work addresses high-dimensional hyperparameter optimization (HPO), where many variables have little influence on the objective, so conventional optimizers spend evaluations on low-impact dimensions and converge slowly. The authors propose Greedy Importance First (GIF), an importance-aware scheduling strategy. GIF estimates hyperparameter importance from a small warm-up sample, groups hyperparameters by importance, and allocates trials to each group in proportion to its importance, while retaining a full-space fallback step to balance exploration and exploitation. GIF is designed as a plug-compatible layer on top of existing optimizers. Under fixed evaluation budgets, it reaches better incumbents with faster convergence than TPE, BOHB, Random Search, and Sequential Grouping on the higher-dimensional benchmarks (five anisotropic analytic functions and NAS-Bench-301). On Bayesmark, where the effective dimensionality is smaller, it remains competitive but with smaller margins.
📝 Abstract
Hyperparameter Optimization (HPO) is essential for building high-performing ML/DL models, yet conventional optimizers often struggle in high-dimensional spaces where evaluations are costly and progress is diluted across many low-impact variables. We propose Greedy Importance First (GIF), an importance-aware scheduling strategy that uses a small-sample warm start to estimate hyperparameter importance, forms importance-based groups, allocates trials proportionally, and retains a full-space fallback. We evaluate GIF under fixed evaluation budgets on five anisotropic analytic functions, Bayesmark, and NAS-Bench-301. On the higher-dimensional benchmarks, GIF reaches better incumbents with faster convergence than TPE, BOHB, Random Search, and Sequential Grouping. On Bayesmark, where the effective dimensionality is smaller, GIF remains competitive but the margins are smaller. Ablation studies show that importance estimation, proportional allocation, and the fallback step all contribute to the gains. We also verify that the HIA component recovers the intended anisotropy on the analytic benchmarks. These results suggest that GIF is a simple and plug-compatible way to improve sample efficiency in high-dimensional HPO.
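The four steps named in the abstract (warm start, importance-based grouping, proportional allocation, full-space fallback) can be illustrated with a minimal sketch. It is not the authors' implementation, and every name and default in it is hypothetical (`gif_sketch`, `estimate_importance`, `fallback_frac`, the group count, the budget). It makes three simplifying assumptions: the importance estimator is a crude variance-of-bin-means proxy rather than the paper's HIA component, the inner optimizer is plain random search standing in for TPE or BOHB, and the fallback trials are run at the end rather than interleaved.

```python
import random
from statistics import mean, pvariance


def objective(x, weights):
    """Toy anisotropic quadratic: coordinates with large weights dominate the loss."""
    return sum(w * xi * xi for w, xi in zip(weights, x))


def estimate_importance(samples, scores, dim, bins=4, lo=-1.0, hi=1.0):
    """Assumed proxy: variance of per-bin mean scores along each dimension."""
    raw = []
    for d in range(dim):
        buckets = [[] for _ in range(bins)]
        for x, s in zip(samples, scores):
            idx = min(int((x[d] - lo) / (hi - lo) * bins), bins - 1)
            buckets[idx].append(s)
        bin_means = [mean(b) for b in buckets if b]
        raw.append(pvariance(bin_means) if len(bin_means) > 1 else 0.0)
    total = sum(raw) or 1.0
    return [r / total for r in raw]  # normalized to sum to 1


def make_groups(importance, n_groups):
    """Sort dimensions by importance (descending) and cut into equal-size groups."""
    order = sorted(range(len(importance)), key=lambda d: -importance[d])
    size = -(-len(order) // n_groups)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]


def allocate(groups, importance, budget):
    """Split the trial budget across groups in proportion to summed importance."""
    mass = [sum(importance[d] for d in g) for g in groups]
    total = sum(mass) or 1.0
    trials = [int(budget * m / total) for m in mass]
    trials[0] += budget - sum(trials)  # rounding remainder goes to the top group
    return trials


def gif_sketch(weights, budget=200, warmup=40, n_groups=3, fallback_frac=0.2, seed=0):
    rng = random.Random(seed)
    dim = len(weights)

    # 1) Warm start: a small uniform sample of the full space.
    samples = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(warmup)]
    scores = [objective(x, weights) for x in samples]
    warmup_best = min(scores)
    best_x, best_score = list(samples[scores.index(warmup_best)]), warmup_best

    # 2) Importance estimate and importance-based groups.
    importance = estimate_importance(samples, scores, dim)
    groups = make_groups(importance, n_groups)

    # 3) Proportional allocation of the non-fallback budget.
    remaining = budget - warmup
    n_fallback = int(remaining * fallback_frac)
    trials = allocate(groups, importance, remaining - n_fallback)

    # Search one group at a time, most important first, holding the
    # other coordinates at the incumbent (random search as a stand-in).
    for group, n in zip(groups, trials):
        for _ in range(n):
            cand = list(best_x)
            for d in group:
                cand[d] = rng.uniform(-1, 1)
            s = objective(cand, weights)
            if s < best_score:
                best_x, best_score = cand, s

    # 4) Full-space fallback: resample every coordinate.
    for _ in range(n_fallback):
        cand = [rng.uniform(-1, 1) for _ in range(dim)]
        s = objective(cand, weights)
        if s < best_score:
            best_x, best_score = cand, s

    return {"best_score": best_score, "warmup_best": warmup_best,
            "importance": importance, "groups": groups,
            "trials": trials, "n_fallback": n_fallback}


if __name__ == "__main__":
    # 12 dimensions, of which only the first two carry substantial weight.
    result = gif_sketch([10.0, 10.0] + [0.01] * 10)
    print(result["groups"], result["trials"], result["best_score"])
```

Because the groups are sorted by estimated importance and have equal size, the first group always receives the largest share of trials in this sketch. Swapping the random-search inner loop for another optimizer restricted to the active group's coordinates is what the plug-compatible framing would suggest, though the abstract does not specify that interface.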