Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

📅 2026-06-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge that skills reused by large language model agents may exhibit latent malicious behaviors under specific runtime conditions, which are difficult to detect through static analysis alone. To tackle this issue, the paper proposes Runtime Skill Auditing (RSA), a novel approach that introduces risk-interface-oriented dynamic probes combined with context-aware test generation, dynamic program analysis, and behavioral trajectory evaluation to accurately identify potential threats in real execution environments. Experimental evaluation on the OpenClaw platform demonstrates that RSA achieves 90.0% accuracy, 88.0% true positive rate, and 8.0% false positive rate across 100 skills—outperforming the best static method by 13 percentage points—and successfully detects all 20 malicious skills involved in self-evolving attacks.
📝 Abstract
Agent skills let LLM agents reuse instructions, resources, tools, and workflows, but they also create a new place for malicious behavior to hide. A skill may look benign in its documentation or code while becoming harmful only when it is invoked with particular user requests, local assets, persistent state, or multi-step tool interactions. This makes purely static vetting brittle. We present Runtime Skill Audit (RSA), a dynamic analysis method that audits skills by asking what the skill-mediated agent actually does under targeted runtime conditions. Instead of testing every skill with the same generic tasks, RSA profiles risk-relevant interfaces, prepares the execution context needed to exercise them, and assigns security labels from the resulting trace evidence. We instantiate RSA on OpenClaw and evaluate it on 100 skills against representative static baselines. RSA achieves 90.0\% accuracy with an 88.0\% true positive rate and an 8.0\% false positive rate, improving accuracy by 13.0 percentage points over the best static baseline. Under self-evolving attacks, static detectors collapse after one or two rounds, while RSA continues to detect 19--20 out of 20 malicious skills across rounds.
Problem

Research questions and friction points this paper is trying to address.

agent skills
runtime probing
skill security
malicious behavior
dynamic analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Runtime Skill Audit
dynamic analysis
LLM agent security
skill-mediated execution
targeted runtime probing
🔎 Similar Papers
No similar papers found.