Self-Reflective APIs: Structure Beats Verbosity for AI Agent Recovery

📅 2026-06-03

🤖 AI Summary
This work addresses the limited ability of current AI agents to recover autonomously from API call failures when error responses carry no actionable repair guidance. The authors propose a self-reflective API that, on validation failure, returns structured, machine-readable suggestions sufficient for the agent to repair the request and retry without external reasoning. Replacing plain-English error diagnoses with structured JSON feedback lets the agent apply the fix directly. The evaluation uses 10 adversarial tasks across 3 LLMs, and the comparison holds only after auditing two previously undocumented classes of answer leakage in LLM benchmarks. On Anthropic models, structured suggestions improve task-completion rate by 36.7–40.0 percentage points and per-success token efficiency by 1.8–2.2×; the lift is not significant on gpt-4o-mini. A second-domain replication on a billing API confirms the pattern.
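To make the idea of structured JSON feedback concrete, here is a minimal Python sketch of an agent applying suggestions from a validation-failure response. Only the recovery_feedback.suggestions path comes from the paper's abstract. The fields inside each suggestion (field, action, value), the two action names, and the example request are assumptions made for illustration, and the authors' actual schema may differ.

```python
import copy

# Hypothetical validation-failure response. Only the
# recovery_feedback.suggestions path is taken from the abstract;
# the inner fields and action names are assumed for this sketch.
error_response = {
    "error": "validation_failed",
    "recovery_feedback": {
        "suggestions": [
            {"field": "currency", "action": "set", "value": "USD"},
            {"field": "amount_cents", "action": "rename_from", "value": "amount"},
        ]
    },
}


def apply_suggestions(request, response):
    """Return a repaired copy of `request` using the response's suggestions."""
    repaired = copy.deepcopy(request)
    suggestions = response.get("recovery_feedback", {}).get("suggestions", [])
    for s in suggestions:
        if s["action"] == "set":
            # Overwrite (or create) the field with the suggested value.
            repaired[s["field"]] = s["value"]
        elif s["action"] == "rename_from" and s["value"] in repaired:
            # Move the value from the old key to the suggested key.
            repaired[s["field"]] = repaired.pop(s["value"])
    return repaired


original = {"amount": 1250, "currency": "usd"}
repaired = apply_suggestions(original, error_response)
print(repaired)
```

The point of the sketch is that the repair step is a mechanical loop over the suggestions: the agent does not have to interpret a prose diagnosis before retrying.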
📝 Abstract
When an AI agent calls an API and hits a validation error, it needs more than what went wrong -- it needs what to do next. A self-reflective API returns, on validation failure, a machine-readable recovery_feedback.suggestions[] payload sufficient for the agent to repair the request and retry without external reasoning. On a leak-audited pilot ($N{=}30$ per cell, 3 LLMs, 10 adversarial tasks), structured suggestions lift task-completion rate by $+36.7$--$40.0$pp over plain-English diagnoses on Anthropic models (Fisher's exact $p \le 0.0022$), at $1.8$--$2.2\times$ better per-success token efficiency. The lift is not significant on gpt-4o-mini ($p{=}0.435$); a second-domain replication on a billing API confirms the pattern. The comparison only holds after auditing two undocumented classes of answer leakage in LLM benchmarks. We ship audit_prompt_leakage.py as reusable CI infrastructure. Code and data: https://github.com/arquicanedo/self-reflective-apis.
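The percentage-point figures and the significance test in the abstract rest on simple count arithmetic. The sketch below, assuming each cell's completion rate is computed over exactly 30 runs, shows that lifts of 36.7pp and 40.0pp correspond to 11 and 12 additional successes out of 30, and gives a small pure-Python two-sided Fisher's exact test. The function names are hypothetical, the 2x2 table in the demo is a textbook example rather than the paper's data, and the authors' own analysis code may differ.

```python
from math import comb


def pp_lift(extra_successes, n):
    """Percentage-point lift when one arm has `extra_successes` more wins out of n."""
    return 100 * extra_successes / n


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two conditions, columns are success/failure counts.
    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against float ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# With N = 30 per cell, the reported lifts match 11 and 12 extra successes.
lift_low = round(pp_lift(11, 30), 1)
lift_high = round(pp_lift(12, 30), 1)

# Textbook demo table (not the paper's data): 3/4 vs 1/4 successes.
p_demo = fisher_exact_two_sided(3, 1, 1, 3)
print(lift_low, lift_high, round(p_demo, 4))
```

An exact test is a natural choice at this sample size because it does not rely on large-sample approximations.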
Problem

Research questions and friction points this paper is trying to address.

AI agent recovery
API validation error
self-reflective API
structured feedback
task completion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Reflective APIs
structured recovery feedback
AI agent resilience
validation error handling
answer leakage auditing