🤖 AI Summary
To address low query efficiency and poor cross-algorithm knowledge transfer in the joint optimization of machine learning algorithms and their hyperparameters, this paper proposes a multi-task Bayesian optimization framework based on a shared latent space. Methodologically, it integrates multi-task Gaussian processes with variational latent-space modeling. Key contributions include: (1) a differentiable embedding mechanism that maps heterogeneous hyperparameter spaces into a unified latent space; (2) an adversarially regularized pre-training strategy that enhances the robustness of the latent representations; and (3) a data-adaptive ranking model that dynamically selects the most effective pre-trained embedding for a given dataset. Evaluated on multiple OpenML datasets, the method reduces the average number of evaluations by 56.5% (a 2.3× speedup) and improves validation accuracy by 3.7–9.2% under a fixed budget, demonstrating substantially improved cross-task knowledge transfer.
📝 Abstract
Selecting the optimal combination of a machine learning (ML) algorithm and its hyper-parameters is crucial for developing high-performance ML systems. However, because the space of possible algorithm–hyper-parameter combinations is enormous, exhaustive validation takes a prohibitive amount of time. Many existing studies use Bayesian optimization (BO) to accelerate the search. A major difficulty, however, is that each candidate ML algorithm generally has its own hyper-parameter space. BO-based approaches typically build a surrogate model independently for each hyper-parameter space, so sufficient observations are required for every candidate ML algorithm. In this study, our proposed method embeds the different hyper-parameter spaces into a shared latent space, in which a multi-task surrogate model for BO is estimated. This allows observations from different ML algorithms to share information, so efficient optimization is expected with a smaller total number of observations. We further propose pre-training the latent-space embedding with adversarial regularization, and a ranking model for selecting an effective pre-trained embedding for a given target dataset. Our empirical study demonstrates the effectiveness of the proposed method on datasets from OpenML.
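The core idea can be sketched in a few lines: configurations from algorithms with different hyper-parameter spaces are mapped into one shared latent space, and a single surrogate model is fit over all observations jointly, so evaluations of one algorithm inform the search over another. The sketch below is a minimal illustration under simplifying assumptions: the embeddings are fixed linear maps (the paper learns a differentiable, adversarially pre-trained embedding), the surrogate is a plain GP posterior mean rather than a multi-task GP, and the algorithm names and observation values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed linear embeddings of each algorithm's hyper-parameter
# space into a shared 2-D latent space (the paper learns these instead).
EMBED = {
    "svm":    np.array([[1.0, 0.0],
                        [0.0, 1.0]]),            # 2-D space -> 2-D latent
    "forest": np.array([[0.5, 0.5],
                        [1.0, -1.0],
                        [0.0, 1.0]]),            # 3-D space -> 2-D latent
}

def embed(algo, theta):
    """Map an algorithm-specific configuration into the shared latent space."""
    return np.asarray(theta) @ EMBED[algo]

def rbf_kernel(A, B, length=1.0):
    """Squared-exponential kernel between two sets of latent points."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior_mean(Z, y, Zq, noise=1e-6):
    """Posterior mean of a zero-mean GP fit on latent points Z with targets y."""
    K = rbf_kernel(Z, Z) + noise * np.eye(len(Z))
    return rbf_kernel(Zq, Z) @ np.linalg.solve(K, y)

# Observations from BOTH algorithms enter ONE surrogate via the latent space;
# the (config, accuracy) values here are illustrative, not real results.
obs = [("svm",    [0.2, 0.8],      0.71),
       ("svm",    [0.9, 0.1],      0.65),
       ("forest", [0.3, 0.4, 0.5], 0.80)]
Z = np.stack([embed(a, t) for a, t, _ in obs])
y = np.array([v for _, _, v in obs])

# Score candidate SVM configurations: information from the forest observation
# transfers through the shared latent space instead of being discarded.
cands = rng.uniform(0.0, 1.0, size=(64, 2))
scores = gp_posterior_mean(Z, y, np.stack([embed("svm", c) for c in cands]))
best = cands[np.argmax(scores)]
```

In a full BO loop, `best` would be evaluated next and appended to `obs`, and the paper additionally replaces the posterior mean with a proper acquisition function and the fixed maps with learned embeddings selected by the proposed ranking model.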