Reinforcement Learning for Self-Improving Agent with Skill Library

📅 2025-12-18
🤖 AI Summary
LLM-based agents adapt poorly to new environments, and skill libraries that rely on prompting are hard to implement consistently. To address both problems, this paper proposes SAGE (Skill Augmented GRPO for self-Evolution), a reinforcement learning framework for self-improving agents. Its core components are: (1) Sequential Rollout, which deploys the agent across a chain of similar tasks so that skills generated on earlier tasks accumulate in the library and can be reused on later ones; and (2) a Skill-integrated Reward that complements the outcome-based reward to encourage skill generation and use. On the AppWorld benchmark, SAGE applied to a model supervised-finetuned on expert experience achieves 8.9% higher Scenario Goal Completion, uses 26% fewer interaction steps, and generates 59% fewer tokens than existing approaches.

📝 Abstract
Large Language Model (LLM)-based agents have demonstrated remarkable capabilities in complex reasoning and multi-turn interactions but struggle to continuously improve and adapt when deployed in new environments. One promising approach is implementing skill libraries that allow agents to learn, validate, and apply new skills. However, current skill library approaches rely primarily on LLM prompting, making consistent skill library implementation challenging. To overcome these challenges, we propose a Reinforcement Learning (RL)-based approach to enhance agents' self-improvement capabilities with a skill library. Specifically, we introduce Skill Augmented GRPO for self-Evolution (SAGE), a novel RL framework that systematically incorporates skills into learning. The framework's key component, Sequential Rollout, iteratively deploys agents across a chain of similar tasks for each rollout. As agents navigate through the task chain, skills generated from previous tasks accumulate in the library and become available for subsequent tasks. Additionally, the framework enhances skill generation and utilization through a Skill-integrated Reward that complements the original outcome-based rewards. Experimental results on AppWorld demonstrate that SAGE, when applied to a supervised-finetuned model with expert experience, achieves 8.9% higher Scenario Goal Completion while requiring 26% fewer interaction steps and generating 59% fewer tokens, substantially outperforming existing approaches in both accuracy and efficiency.
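
The Sequential Rollout idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `SkillLibrary`, the `Agent` callable signature, and the rule that only skills from successful tasks are kept are all assumptions made for illustration. The sketch only shows the control flow in which skills produced on earlier tasks of a chain become available on later ones.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SkillLibrary:
    """Hypothetical in-memory store mapping skill names to source code."""
    skills: Dict[str, str] = field(default_factory=dict)

    def add(self, name: str, code: str) -> None:
        self.skills[name] = code

    def names(self) -> List[str]:
        return sorted(self.skills)


# Hypothetical agent interface: given a task and the names of available
# skills, return (success flag, newly generated skills as {name: code}).
Agent = Callable[[str, List[str]], Tuple[bool, Dict[str, str]]]


def sequential_rollout(agent: Agent,
                       task_chain: List[str]) -> Tuple[List[bool], SkillLibrary]:
    """Run one rollout over a chain of similar tasks with a shared library."""
    library = SkillLibrary()  # starts empty for each rollout
    outcomes: List[bool] = []
    for task in task_chain:
        # Skills accumulated on earlier tasks are visible to this task.
        success, new_skills = agent(task, library.names())
        outcomes.append(success)
        # Assumed validation rule: keep skills only from successful tasks.
        if success:
            for name, code in new_skills.items():
                library.add(name, code)
    return outcomes, library
```

In a real system the `agent` callable would wrap an LLM policy interacting with the environment; here it can be any function with the assumed signature, which makes the accumulation logic easy to test in isolation.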

Problem

Research questions and friction points this paper is trying to address.

Enhance agent self-improvement with skill libraries
Address skill library implementation challenges via RL
Improve accuracy and efficiency in new environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement Learning framework for skill library enhancement
Sequential Rollout accumulates skills across similar task chains
Skill-integrated Reward improves skill generation and utilization
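
As a rough illustration of how a skill-aware reward could feed into GRPO, the sketch below combines an outcome reward with a skill bonus and then computes group-relative advantages. The summary does not give the exact form of the Skill-integrated Reward, so `skill_integrated_reward`, its `skill_used` flag, and the `skill_bonus` value are hypothetical. The advantage step follows the standard GRPO recipe of standardizing each reward by the mean and standard deviation of its rollout group.

```python
from statistics import mean, pstdev
from typing import List


def skill_integrated_reward(outcome_reward: float, skill_used: bool,
                            skill_bonus: float = 0.5) -> float:
    """Assumed reward shape: add a bonus only when the task succeeded
    and a library skill contributed. The paper's formula may differ."""
    if outcome_reward > 0 and skill_used:
        return outcome_reward + skill_bonus
    return outcome_reward


def group_relative_advantages(rewards: List[float],
                              eps: float = 1e-8) -> List[float]:
    """Standardize rewards within one group of rollouts, as in GRPO."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Under these assumptions, a rollout that succeeds while using a skill receives a higher advantage than one that succeeds without it, which is one way the reward could push the policy toward generating and reusing skills.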

Authors

Jiongxiao Wang, University of Wisconsin–Madison
Qiaojing Yan, AWS Agentic AI
Yawei Wang, AWS Agentic AI
Yijun Tian, Amazon AWS AI Lab (Large Language Models, Graph Machine Learning)
Soumya Smruti Mishra, AWS Agentic AI
Zhichao Xu, Amazon AWS, University of Utah (natural language processing, information retrieval)
Megha Gandhi, AWS Agentic AI
Panpan Xu, Principal Applied Scientist, AWS AI/ML
Lin Lee Cheong, Amazon AWS