Capability-Aligned Hierarchical Learning for Tool-Augmented LLMs

📅 2026-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing hierarchical tool-learning approaches often suffer from planner-executor misalignment because the high-level planner and the low-level executor are optimized independently. This work proposes Capability-Aligned Hierarchical Learning (CAHL), a framework built around joint optimization of the two levels. Specifically, it uses RLVR (reinforcement learning with verifiable rewards) to co-train the high-level task planner and the low-level tool-invocation policy, aligning their capabilities and overcoming the modular fragmentation of conventional methods. Experiments on the constrained benchmarks API-Bank and BFCL, as well as the open-ended Bamboogle environment, show that the proposed approach significantly outperforms current state-of-the-art methods, supporting both the effectiveness and the generalizability of capability-aligned optimization.

📝 Abstract

Tool learning enables LLMs to invoke external tools to accomplish tasks. Prior studies have demonstrated the effectiveness of a hierarchical structure: a high-level policy handles global planning and decomposes tasks into manageable sub-tasks, and a low-level policy focuses on invoking tools to solve these sub-tasks. However, these works typically optimize the high-level and low-level policies separately, leading to planner-executor misalignment and limiting LLM performance on tool-use tasks. In this paper, we propose a method called Capability-Aligned Hierarchical Learning (CAHL), which leverages RLVR to jointly optimize both policies, enabling better alignment between the high-level planner and the low-level executor. Experiments on constrained tool-use benchmarks (API-Bank and BFCL) and an open-ended environment (Bamboogle) demonstrate the effectiveness of CAHL.
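The planner-executor structure described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the arithmetic tools, the `plan`, `execute`, and `rollout` names, and the exact-match reward are hypothetical stand-ins, not the CAHL implementation, and the abstract does not specify how rewards are assigned. The sketch only shows the idea of scoring one rollout with a single verifiable outcome and crediting that same signal to both the high-level and low-level decisions, rather than scoring each level separately.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical toy tools; real benchmarks such as API-Bank use actual APIs.
TOOLS: Dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


@dataclass
class Step:
    level: str          # "high" (planner) or "low" (executor)
    action: str         # the plan text or the tool call that was issued
    reward: float = 0.0


def plan(task: str) -> List[str]:
    """Stand-in for the high-level policy: split a task into sub-tasks."""
    return [part.strip() for part in task.split(";") if part.strip()]


def execute(sub_task: str, acc: float) -> Tuple[str, float]:
    """Stand-in for the low-level policy: map a sub-task to one tool call."""
    name, arg = sub_task.split()
    return f"{name}({acc}, {arg})", TOOLS[name](acc, float(arg))


def rollout(task: str, start: float, expected: float) -> List[Step]:
    """Run planner then executor, and score both with one outcome reward."""
    sub_tasks = plan(task)
    steps = [Step("high", " | ".join(sub_tasks))]
    acc = start
    for sub_task in sub_tasks:
        call, acc = execute(sub_task, acc)
        steps.append(Step("low", call))
    # Assumed verifiable reward: exact match against a known answer.
    reward = 1.0 if acc == expected else 0.0
    for step in steps:
        step.reward = reward  # the same signal reaches both levels
    return steps


if __name__ == "__main__":
    for step in rollout("add 2; mul 3", start=1.0, expected=9.0):
        print(step.level, step.action, step.reward)
```

In a real training setup, the per-step rewards would feed a policy-gradient update for both policies; that part is omitted here because the source does not describe it.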

Problem

Research questions and friction points this paper is trying to address.

tool learning

hierarchical learning

planner-executor misalignment

LLMs

capability alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Capability-Aligned Hierarchical Learning

tool-augmented LLMs

hierarchical policy optimization