🤖 AI Summary
Pareto optimization of large language models (LLMs) for capability and inference efficiency remains challenging: model-level approaches yield only sparse Pareto sets, while layer-level methods face a curse of dimensionality from their high-dimensional search spaces.
Method: We propose a block-level Pareto set construction framework. First, we introduce a novel hybrid optimal block partitioning strategy that formulates inter-layer optimization as a one-dimensional dynamic programming clustering problem. Second, we design a Bayesian multi-objective evolutionary loop based on the quasi-Expected Hypervolume Improvement (qEHVI) acquisition function to enable fully automated, high-fidelity Pareto front generation.
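The partitioning step above casts inter-layer grouping as optimal 1D clustering of consecutive layers, solvable exactly by dynamic programming. Below is a minimal, hedged sketch of that DP skeleton: it partitions a sequence of per-layer scores into `k` contiguous blocks minimizing within-block sum of squared deviations. The scalar `scores` input and the pure SSE objective are illustrative assumptions; the paper's actual criterion additionally balances intra-block homogeneity against inter-block information distribution.

```python
def block_partition(scores, k):
    """Optimal partition of a 1D score sequence into k contiguous blocks,
    minimizing total within-block sum of squared deviations (SSE).
    Illustrative sketch only; not the paper's exact objective."""
    n = len(scores)
    # Prefix sums of scores and squared scores for O(1) SSE queries.
    ps = [0.0] * (n + 1)
    ps2 = [0.0] * (n + 1)
    for i, s in enumerate(scores):
        ps[i + 1] = ps[i] + s
        ps2[i + 1] = ps2[i] + s * s

    def sse(i, j):
        # SSE of scores[i..j] inclusive: sum(x^2) - (sum(x))^2 / m
        m = j - i + 1
        total = ps[j + 1] - ps[i]
        return (ps2[j + 1] - ps2[i]) - total * total / m

    INF = float("inf")
    # dp[b][j]: min cost of splitting scores[0..j] into b blocks.
    dp = [[INF] * n for _ in range(k + 1)]
    cut = [[0] * n for _ in range(k + 1)]  # start index of the last block
    for j in range(n):
        dp[1][j] = sse(0, j)
    for b in range(2, k + 1):
        for j in range(b - 1, n):
            for i in range(b - 1, j + 1):  # last block covers scores[i..j]
                c = dp[b - 1][i - 1] + sse(i, j)
                if c < dp[b][j]:
                    dp[b][j] = c
                    cut[b][j] = i
    # Backtrack the block boundaries (inclusive index pairs).
    bounds, j = [], n - 1
    for b in range(k, 0, -1):
        i = cut[b][j] if b > 1 else 0
        bounds.append((i, j))
        j = i - 1
    return bounds[::-1], dp[k][n - 1]
```

For example, scores `[1.0, 1.1, 0.9, 5.0, 5.2, 9.0]` with `k=3` split into blocks `(0,2)`, `(3,4)`, `(5,5)`, grouping layers whose scores are most alike while keeping every block contiguous.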
Contribution/Results: By integrating block-wise parameter merging with multi-objective evolutionary optimization, our method significantly improves Pareto front coverage and quality across multiple LLMs. It achieves an average 32.7% hypervolume gain over state-of-the-art methods and supports agile model selection under joint constraints, including latency, GPU memory, and accuracy.
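The hypervolume gain reported above is measured with the standard hypervolume indicator: the volume of objective space dominated by a Pareto front, relative to a reference point. A minimal two-objective sketch (both objectives minimized, reference point worse than all solutions); this is a generic implementation for illustration, not the paper's evaluation code:

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2D front w.r.t. reference point `ref`.
    Assumes both objectives are minimized and ref dominates nothing."""
    # Keep only non-dominated points, sorted by the first objective.
    pareto, best_y = [], float("inf")
    for x, y in sorted(set(front)):
        if y < best_y:
            pareto.append((x, y))
            best_y = y
    # Sweep from largest x down, adding each point's exclusive rectangle.
    hv, prev_x = 0.0, ref[0]
    for x, y in reversed(pareto):
        hv += (prev_x - x) * (ref[1] - y)
        prev_x = x
    return hv
```

A larger hypervolume means the front pushes further toward the ideal point and/or covers the trade-off curve more densely, which is why it serves as a single scalar summary of Pareto front quality.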
📝 Abstract
Constructing a Pareto set is pivotal for navigating the capability-efficiency trade-offs in Large Language Models (LLMs); however, existing merging techniques remain inadequate for this task. Coarse-grained, model-level methods yield only a sparse set of suboptimal solutions, while fine-grained, layer-wise approaches suffer from the "curse of dimensionality," rendering the search space computationally intractable. To resolve this dichotomy, we propose BAMBO (Bayesian Adaptive Multi-objective Block-wise Optimization), a novel framework that automatically constructs the LLM Pareto set. BAMBO renders the search tractable by introducing a Hybrid Optimal Block Partitioning strategy. Formulated as a 1D clustering problem, this strategy leverages a dynamic programming approach to optimally balance intra-block homogeneity and inter-block information distribution, thereby dramatically reducing dimensionality without sacrificing critical granularity. The entire process is automated within an evolutionary loop driven by the q-Expected Hypervolume Improvement (qEHVI) acquisition function. Experiments demonstrate that BAMBO discovers a superior and more comprehensive Pareto frontier than baselines, enabling agile model selection tailored to diverse operational constraints. Code is available at: https://github.com/xin8coder/BAMBO.