SkillPager: Query-Adaptive Intra-Skill Navigation via Semantic Node Retrieval

📅 2026-05-30
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the contextual redundancy and dilution of critical information that arise when skill-based LLM agents consume lengthy procedural documents through full-document prompting. The authors propose SkillPager, a two-stage framework that introduces typed semantic granularity into skill document retrieval. Offline, SkillPager parses each Markdown skill document into typed semantic nodes; online, it applies query-adaptive Maximal Marginal Relevance (MMR) retrieval to select a minimal yet execution-sufficient context. On a benchmark of 395 skills and 1,975 queries, SkillPager achieves 78.89% LLM-judged context sufficiency, 3.34 percentage points below the 82.23% full-document baseline, while reducing prompt tokens by 47.04%. It also outperforms the strongest graph-based baseline by 12.16%, and a granularity ablation attributes the efficiency gains to typed semantic nodes rather than to the retrieval algorithm alone.
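To make the offline stage concrete, the following is a minimal sketch of splitting a Markdown skill document into typed nodes. It is an illustration only, not the authors' parser: the `Node` class, the `parse_markdown` function, and the four type labels (`heading`, `text`, `list`, `code`) are all assumptions, since the summary does not specify the node taxonomy.

```python
import re
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code-fence marker, built indirectly for readability


@dataclass
class Node:
    kind: str  # hypothetical type label: "heading", "text", "list", or "code"
    text: str


def parse_markdown(md: str) -> list[Node]:
    """Split a Markdown string into coarse typed nodes (illustrative only)."""
    nodes: list[Node] = []
    buf: list[str] = []
    kind = "text"
    in_code = False

    def flush() -> None:
        # Emit the buffered lines as one node of the current kind.
        nonlocal buf
        if buf:
            nodes.append(Node(kind, "\n".join(buf)))
            buf = []

    for line in md.splitlines():
        if line.startswith(FENCE):
            if in_code:  # closing fence ends the code node
                buf.append(line)
                flush()
                in_code, kind = False, "text"
            else:  # opening fence starts a new code node
                flush()
                in_code, kind = True, "code"
                buf.append(line)
        elif in_code:
            buf.append(line)
        elif not line.strip():  # blank line ends the current block
            flush()
            kind = "text"
        elif line.startswith("#"):  # heading is its own node
            flush()
            nodes.append(Node("heading", line.lstrip("#").strip()))
            kind = "text"
        elif re.match(r"\s*([-*+]|\d+\.)\s+", line):  # list item
            if kind != "list":
                flush()
                kind = "list"
            buf.append(line)
        else:  # plain paragraph text
            if kind != "text":
                flush()
                kind = "text"
            buf.append(line)
    flush()
    return nodes
```

Each resulting node carries a type label that a retriever could later use, for example to treat code blocks and prose differently when assembling context.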
📝 Abstract
Skill-based LLM agents increasingly rely on long procedural documents, but full-document prompting wastes tokens and dilutes information critical to execution. We study this setting as intra-skill retrieval, where the goal is to select a minimal, execution-sufficient context from a known skill document given a query. We present SkillPager, a two-stage framework that parses each Markdown skill into typed semantic nodes offline and leverages Maximal Marginal Relevance (MMR) to perform global, query-conditioned node selection online. On a benchmark of 395 skills and 1,975 queries, SkillPager achieves 78.89% LLM-judged context sufficiency, compared to 82.23% for the exhaustive full-document baseline, while reducing prompt tokens by 47.04%. A granularity ablation shows that applying the same retrieval algorithm to raw fixed-length chunks reaches a comparable 81.77% sufficiency but increases token cost by 28.81%, demonstrating that efficiency gains are driven by typed semantic granularity rather than the retrieval algorithm alone. Among graph-based baselines, SkillPager outperforms the strongest baseline by a margin of 12.16%. Further ablations show that supporting content is most effective when retained in the candidate pool and selected adaptively rather than removed by static heuristics. These results identify typed intra-document retrieval as a distinct access problem for skill-based agents.
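The online stage relies on Maximal Marginal Relevance, which greedily picks the node that best trades off relevance to the query against redundancy with nodes already selected. The sketch below shows the standard MMR rule over a list of node texts. It is a toy under stated assumptions: the bag-of-words `embed` function stands in for whatever embedding model the paper uses, and the names `mmr_select`, `k`, and `lam` are hypothetical, as is the fixed-size stopping rule.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would likely use dense embeddings.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query: str, nodes: list[str], k: int = 3, lam: float = 0.7) -> list[int]:
    """Return indices of up to k nodes chosen by greedy MMR.

    lam weights relevance; (1 - lam) weights the redundancy penalty.
    """
    q = embed(query)
    vecs = [embed(n) for n in nodes]
    rel = [cosine(q, v) for v in vecs]
    selected: list[int] = []
    remaining = list(range(len(nodes)))

    while remaining and len(selected) < k:
        def score(i: int) -> float:
            # Redundancy is the highest similarity to any already-selected node.
            red = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * rel[i] - (1 - lam) * red

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam` close to 1 the selection reduces to plain relevance ranking; lowering it penalizes near-duplicate nodes, so a less relevant but non-overlapping node can displace a redundant one.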
Problem

Research questions and friction points this paper is trying to address.

intra-skill retrieval
semantic node retrieval
skill-based agents
context sufficiency
typed granularity
Innovation

Methods, ideas, or system contributions that make the work stand out.

intra-skill retrieval
semantic node parsing
typed granularity
query-adaptive context selection
Maximal Marginal Relevance