Sophrosyne: Agentic Exploration of Relational Data Systems Needs Moderation

📅 2026-05-29
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the tendency of existing large language model (LLM)-driven Text-to-SQL agents to over-explore in fine-grained API environments, often incorporating irrelevant schema elements and thereby degrading SQL generation accuracy. To mitigate this issue, the paper introduces a novel mechanism that embeds guiding instructions directly within data system API responses to actively regulate the agent's exploration behavior. This approach enhances exploration efficiency while preserving query precision. Notably, it represents the first method to leverage API response instructions for constraining excessive exploration. Experimental results demonstrate that the proposed guiding instructions reduce over-exploration by a factor of 4.6 and improve SQL generation accuracy by up to 12.4% (approximately 4 percentage points).
πŸ“ Abstract
Text2SQL agents powered by LLMs translate natural language intent into SQL by exploring the data system through tool calls before formulating the query. However, to ensure secure and scoped access, data systems construct environments with explicit API surfaces. We study and categorize these APIs exposed today as either coarse-grained or fine-grained and posit that choosing between them presents a fundamental tradeoff between cost-efficient exploration and accurate SQL generation. Most data systems expose fine-grained APIs, but this inadvertently disadvantages agents: they over-explore, incorporating irrelevant schema elements into their query formulation and produce inaccurate results. We argue that curbing over-exploration is key to the effective use of these API surfaces, and propose Sophrosyne, a data system environment that augments API responses with directives that guide the agent's exploration process. Initial results show that directives reduce over-exploration by 4.6x and boost accuracy by up to 12.4% (approx. 4 percentage points).
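To make the idea of augmenting API responses with directives concrete, here is a minimal Python sketch. It is not the paper's implementation: the toy catalog `SCHEMA`, the class `DirectedEnvironment`, the `describe_table` call, the `budget` parameter, and the directive wording are all hypothetical names and choices made for illustration, since the abstract does not specify how directives are produced or phrased.

```python
from dataclasses import dataclass, field

# Hypothetical toy catalog standing in for a real data system's schema.
SCHEMA = {
    "orders": ["order_id", "customer_id", "total"],
    "customers": ["customer_id", "name"],
    "audit_log": ["event_id", "payload"],
}


@dataclass
class DirectedEnvironment:
    """Fine-grained API surface whose responses carry a directive for the agent."""

    budget: int = 2  # assumed soft cap on distinct tables the agent may inspect
    explored: set = field(default_factory=set)

    def describe_table(self, table: str) -> dict:
        """Return one table's columns plus a directive about further exploration."""
        if table not in SCHEMA:
            return {"error": f"unknown table: {table}", "directive": self._directive()}
        self.explored.add(table)  # repeated calls on the same table count once
        return {
            "table": table,
            "columns": SCHEMA[table],
            "directive": self._directive(),
        }

    def _directive(self) -> str:
        """Build the directive text from how much of the budget is left."""
        remaining = self.budget - len(self.explored)
        if remaining > 0:
            return (
                f"You may inspect {remaining} more table(s). "
                "Only inspect tables the question requires."
            )
        return "Stop exploring. Write the SQL query using only the tables it needs."


env = DirectedEnvironment(budget=2)
first = env.describe_table("orders")
second = env.describe_table("customers")
print(first["directive"])
print(second["directive"])
```

In this sketch the directive is a plain string returned alongside the normal payload, so the agent reads it as part of the tool result. A fixed table budget is only one possible policy; the abstract does not say which signals Sophrosyne uses to decide what to tell the agent.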
Problem

Research questions and friction points this paper is trying to address.

Text2SQL
LLM agents
relational data systems
over-exploration
API granularity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text2SQL
LLM agents
API moderation
over-exploration mitigation
relational data systems