SkillGuard: A Permission Framework for Agent Skills

📅 2026-06-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of effective oversight in existing agent skill ecosystems, where mismatches between a skill's declared intent and its runtime behavior introduce security and privacy risks. To bridge this gap, the authors propose SkillGuard, a skill-centric permission framework that treats skills as executable units carrying explicit permissions. SkillGuard enforces consistency between intent and behavior through a dual-plane governance model that jointly regulates contextual influence and action side effects. It combines declarative manifests, runtime access control, user-mediated authorization, deny-by-default policies, capability inference, and behavioral monitoring. In an evaluation on 315 real-world skills and SkillInject, the permission taxonomy covers 99.76% of observed protected objects and automated manifest generation reaches a 91.0% F1 score. Attack success drops from 32.37% to 23.02% for contextual injections and from 25.56% to 16.67% for obvious injections, while utility on benign tasks is maintained.
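The combination of declarative manifests, deny-by-default enforcement, and user-mediated authorization can be illustrated with a minimal sketch. It is not SkillGuard's implementation: the `Permission`, `SkillManifest`, and `authorize` names, the two-plane labels, and the resource strings are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Permission:
    """A hypothetical permission: a governance plane plus a protected resource."""
    plane: str      # "context" (what a skill may inject) or "action" (side effects)
    resource: str   # illustrative resource label, e.g. "fs:read:/workspace"


@dataclass(frozen=True)
class SkillManifest:
    """A hypothetical manifest declaring what a skill is allowed to do."""
    name: str
    granted: frozenset = frozenset()    # allowed without asking
    ask_user: frozenset = frozenset()   # allowed only after user approval


def authorize(
    manifest: SkillManifest,
    request: Permission,
    prompt_user: Callable[[str, Permission], bool],
) -> bool:
    """Deny by default: anything not declared in the manifest is refused."""
    if request in manifest.granted:
        return True
    if request in manifest.ask_user:
        # User-mediated authorization: the user makes the final call.
        return bool(prompt_user(manifest.name, request))
    return False


# Example manifest for an imaginary note-taking skill.
manifest = SkillManifest(
    name="notes-helper",
    granted=frozenset({Permission("context", "inject:system-notes")}),
    ask_user=frozenset({Permission("action", "net:post:example.invalid")}),
)

always_no = lambda skill, perm: False

# Declared and granted: allowed.
print(authorize(manifest, Permission("context", "inject:system-notes"), always_no))
# Declared but user declines: refused.
print(authorize(manifest, Permission("action", "net:post:example.invalid"), always_no))
# Never declared: refused regardless of the user's answer.
print(authorize(manifest, Permission("action", "fs:write:/etc"), lambda s, p: True))
```

Keeping the undeclared case as the final `return False` means a request falls through to refusal unless the manifest names it, which is the property the deny-by-default mechanism describes.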

📝 Abstract

Agent skills extend LLM agents with reusable instructions, scripts, tool bindings, and contextual dependencies. However, current skill ecosystems largely rely on trust-based loading and static inspection, leaving a gap between what a skill can inject into an agent's context and what it can cause the agent to do at runtime. This gap introduces new security and privacy risks, and existing defenses primarily inspect skill files statically or regulate individual tool calls, without systematically connecting a skill's declared intent with its runtime behavior. In this paper, we present SkillGuard, a skill-centric permission framework that treats skills as permission-bearing executable artifacts. SkillGuard introduces a dual-plane governance model that jointly regulates context influence and action side effects through skill manifests, runtime access control, user-mediated authorization, deny-by-default enforcement, capability inference, and behavior monitoring. We evaluate SkillGuard on 315 real-world skills and SkillInject. The permission taxonomy covers 99.76% of observed protected objects, and automated manifest generation reaches 91.0% F1. In adversarial evaluations, SkillGuard reduces attack success from 32.37% to 23.02% for contextual injections and from 25.56% to 16.67% for obvious injections, while maintaining benign task utility. These results suggest that SkillGuard, as a skill-centric permission framework, can provide a practical foundation for improving the privacy and security of agent skill ecosystems.
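To put the reported attack success rates in perspective, the short calculation below derives the absolute and relative reductions from the four percentages in the abstract. The derived figures are simple arithmetic on those numbers and are not reported by the paper itself; the `relative_reduction` helper is a name chosen for this example.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage drop of `after` relative to `before`."""
    return (before - after) / before * 100


# Attack success rates (%) as stated in the abstract.
contextual_before, contextual_after = 32.37, 23.02
obvious_before, obvious_after = 25.56, 16.67

contextual = relative_reduction(contextual_before, contextual_after)
obvious = relative_reduction(obvious_before, obvious_after)

print(f"contextual: -{contextual_before - contextual_after:.2f} points, "
      f"{contextual:.1f}% relative")   # -9.35 points, 28.9% relative
print(f"obvious:    -{obvious_before - obvious_after:.2f} points, "
      f"{obvious:.1f}% relative")      # -8.89 points, 34.8% relative
```

On these numbers, roughly two thirds to three quarters of the attacks that succeeded without SkillGuard still succeed with it, so the reduction is meaningful but partial.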

Problem

Research questions and friction points this paper is trying to address.

agent skills

security

privacy

permission framework

runtime behavior

Innovation

Methods, ideas, or system contributions that make the work stand out.

permission framework

agent skills

runtime behavior monitoring