What Should a Skill Remember? Quality-Cost Trade-offs in Cost-Aware Skill Rewriting for Language Model Agents

📅 2026-06-08

🤖 AI Summary

Existing skill rewriting approaches often reduce the problem to prompt compression, overlooking the effect of rewriting on agent execution cost and task quality. This work reframes skill design as cost-aware operational knowledge engineering and introduces a controlled framework that profiles skill structure and rewrites skills according to information-preservation strategies. On the SkillsBench benchmark, the study evaluates several anchoring strategies (API/code anchoring, workflow guarding, and rule/formula anchoring) under fixed task instructions, environments, and verifiers, and finds that different task families benefit from different strategies, with no universally dominant template. In the main held-out evaluation, the learned policy reduces total cost by 7.0% and downstream agent-token cost by 6.0%; under frozen cross-model transfer, the reductions average 14.7% and 13.7%, respectively, while verifier quality is preserved.

📝 Abstract

Large language model agents increasingly rely on skills: reusable procedural documents encoding workflows, tool use, implementation patterns, validation checks, and domain rules. Skill rewriting is often treated as prompt compression, but shorter skills can make agents more expensive by removing sparse operational anchors that prevent exploration, debugging, and recovery. We study skill rewriting through this economic lens. Our controlled framework profiles skill structure, rewrites skills using information-preservation strategies, and evaluates the rewrites under fixed task instructions, environments, and verifiers. Experiments on SkillsBench reveal distinct quality-cost trade-offs across strategies: API/code anchoring, workflow guarding, and rule/formula anchoring benefit different task families, with no universally dominant template. In the main held-out evaluation, the learned policy reduces total cost by 7.0% and downstream agent-token cost by 6.0%; in frozen cross-model transfer, the corresponding reductions average 14.7% and 13.7%, while verifier quality is preserved. These results position skill design as cost-aware operational knowledge engineering rather than prompt compression. Resources: SkillEE (https://github.com/1Reminding/Skill_EE).
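
The abstract's central point, that a shorter skill is only better if verifier quality holds and total cost actually drops, can be illustrated with a small sketch. This is not the paper's learned policy, whose details are not given here; it is a simple hand-written selection rule under stated assumptions. The names `RunResult`, `relative_reduction`, and `pick_rewrite`, the strategy labels, and all numbers are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Aggregate outcome of running one skill variant on a fixed task set."""
    strategy: str      # hypothetical label, e.g. "original" or "workflow_guard"
    pass_rate: float   # fraction of tasks accepted by the verifier
    total_cost: float  # total cost in arbitrary units


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage cost reduction of `candidate` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline cost must be positive")
    return 100.0 * (baseline - candidate) / baseline


def pick_rewrite(baseline: RunResult, candidates: list[RunResult],
                 tolerance: float = 0.0) -> RunResult:
    """Pick the cheapest candidate whose pass rate stays within `tolerance`
    of the baseline; keep the baseline if no candidate is both
    quality-preserving and cheaper."""
    ok = [c for c in candidates if c.pass_rate >= baseline.pass_rate - tolerance]
    best = min(ok, key=lambda c: c.total_cost, default=baseline)
    return best if best.total_cost < baseline.total_cost else baseline


# Made-up numbers, not results from the paper.
baseline = RunResult("original", pass_rate=0.60, total_cost=200.0)
candidates = [
    RunResult("shortest_compression", pass_rate=0.50, total_cost=150.0),
    RunResult("workflow_guard", pass_rate=0.60, total_cost=180.0),
    RunResult("api_code_anchor", pass_rate=0.62, total_cost=190.0),
]
chosen = pick_rewrite(baseline, candidates)
print(chosen.strategy, relative_reduction(baseline.total_cost, chosen.total_cost))
```

In this toy setup the most aggressively compressed variant is the cheapest but is rejected because its pass rate falls below the baseline, which mirrors the abstract's distinction between prompt compression and cost-aware rewriting.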

Problem

Research questions and friction points this paper is trying to address.

skill rewriting

quality-cost trade-offs

cost-aware

language model agents

operational knowledge

Innovation

Methods, ideas, or system contributions that make the work stand out.

cost-aware skill rewriting

quality-cost trade-off

operational knowledge engineering