Surrogate Benchmarks for Model Merging Optimization

📅 2025-09-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Hyperparameter optimization for merging large language models (LLMs) incurs prohibitive computational costs and hinders efficient evaluation of optimization algorithms. Method: The authors propose a lightweight surrogate benchmark that defines a multidimensional hyperparameter search space, collects a modest number of real merging experiments, and constructs a regression-based surrogate model that predicts merged-model performance across hyperparameter configurations. Contribution/Results: Compared with direct tuning, the surrogate reduces evaluation cost by over two orders of magnitude while faithfully reproducing the convergence behavior and relative ranking of optimization algorithms. Experiments across multiple LLM merging tasks demonstrate high predictive accuracy (average MAE < 0.8%) and strong generalization. The benchmark provides an efficient, reproducible, and low-cost standardized testbed for developing, comparing, and deploying hyperparameter optimization algorithms for LLM merging.

📝 Abstract
Model merging techniques aim to integrate the abilities of multiple models into a single model. Most model merging techniques have hyperparameters, and their settings affect the performance of the merged model. Because several existing works show that tuning hyperparameters in model merging can enhance the merging outcome, developing hyperparameter optimization algorithms for model merging is a promising direction. However, the optimization process is computationally expensive, particularly when merging LLMs. In this work, we develop surrogate benchmarks for optimization of the merging hyperparameters to realize algorithm development and performance comparison at low cost. We define two search spaces and collect data samples to construct surrogate models that predict the performance of a merged model from a hyperparameter configuration. We demonstrate that our benchmarks predict the performance of merged models well and simulate the behavior of optimization algorithms.
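The pipeline the abstract describes (run a few real merge-and-evaluate experiments, fit a cheap predictor on them, then let optimizers query the predictor instead of merging models) can be sketched as below. This is a minimal illustration, not the paper's actual benchmark: the synthetic data, the 3-dimensional search space, and the k-nearest-neighbor regressor are all assumptions standing in for the paper's real search spaces and surrogate models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend each row is a merging-hyperparameter vector (e.g., per-model
# mixing weights) and y is the merged model's benchmark score from a real,
# expensive evaluation. Synthetic toy data for illustration only.
X_real = rng.uniform(0.0, 1.0, size=(64, 3))
y_real = 0.6 + 0.3 * np.sin(X_real @ np.array([2.0, 1.0, 0.5]))

def cheap_objective(hparams, k=5):
    """Surrogate stand-in for an expensive merge-and-evaluate run:
    predict the score as the mean over the k nearest collected samples."""
    h = np.asarray(hparams, dtype=float)
    dists = np.linalg.norm(X_real - h, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(y_real[nearest].mean())

# Any optimizer can now query the surrogate cheaply; here, random search.
candidates = rng.uniform(0.0, 1.0, size=(1000, 3))
scores = [cheap_objective(c) for c in candidates]
best = candidates[int(np.argmax(scores))]
```

Because every query is a table lookup rather than an LLM merge plus evaluation, many optimization algorithms can be run and compared against the same surrogate at negligible cost, which is the point of the benchmark.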
Problem

Research questions and friction points this paper is trying to address.

Optimizing hyperparameters for model merging techniques
Reducing computational costs in merging large language models
Developing surrogate benchmarks for performance prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Surrogate benchmarks for hyperparameter optimization
Predict merged model performance from hyperparameters
Low-cost algorithm development and comparison
Rio Akizuki
Yokohama National University
Yuya Kudo
Yokohama National University
Nozomu Yoshinari
Yokohama National University
Yoichi Hirose
Yokohama National University
Toshiyuki Nishimoto
Yokohama National University
Kento Uchida
Kyoto University
Shinichi Shirakawa
Yokohama National University
Evolutionary Computation · Machine Learning · AutoML · Computer Vision