Pneuma-Seeker: A Relational Reification Mechanism to Align AI Agents with Human Work over Relational Data

📅 2026-03-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the tendency of LLM-based systems to hallucinate or return erroneous answers when user intent is under-specified. The authors propose relational reification, a mechanism that iteratively turns a vague, evolving information need into a relational schema shared between user and system; the system then discovers and prepares relevant heterogeneous sources and synthesizes an executable program that computes the answer. The approach combines an LLM-powered agentic architecture with conductor-style planning and macro- and micro-level context management, so that the analysis-ready data model is refined jointly with the user and remains open to inspection. In the reported experiments, the system achieves higher answer accuracy than state-of-the-art academic and industrial baselines across multiple domains, and a deployment in a real organization highlights trust and inspectability as essential requirements for LLM-mediated data systems.

📝 Abstract

When faced with data problems, many data workers cannot articulate their information need precisely enough for software to help. Although LLMs interpret natural-language requests, they behave brittly when intent is under-specified, e.g., hallucinating fields, assuming join paths, or producing ungrounded answers. We present Pneuma-Seeker, a system built around a central idea: relational reification. Pneuma-Seeker represents a user's evolving information need as a relational schema: a concrete, analysis-ready data model shared between user and system. Rather than answering prompts directly, Pneuma-Seeker iteratively refines this schema, then discovers and prepares relevant sources to construct a relation and executable program that compute the answer. Pneuma-Seeker employs an LLM-powered agentic architecture with conductor-style planning and macro- and micro-level context management to operate effectively over heterogeneous relational corpora. We evaluate Pneuma-Seeker across multiple domains against state-of-the-art academic and industrial baselines, demonstrating higher answer accuracy. Deployment in a real organization highlights trust and inspectability as essential requirements for LLM-mediated data systems.
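To make the idea of relational reification more concrete, the sketch below models an information need as an explicit target schema that is refined step by step and checked against a catalog of source tables. It is an illustration only, not the Pneuma-Seeker implementation: `Column`, `TargetSchema`, `find_sources`, `ungrounded`, and the sample catalog are all hypothetical names introduced here, and exact column-name matching stands in for whatever discovery method the system actually uses.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str


@dataclass
class TargetSchema:
    """Hypothetical stand-in for the schema shared between user and system."""
    name: str
    columns: list[Column] = field(default_factory=list)

    def refine(self, add=(), drop=()):
        """Return a new schema with columns dropped and added (one refinement step)."""
        dropped = set(drop)
        kept = [c for c in self.columns if c.name not in dropped]
        known = {c.name for c in kept}
        kept.extend(c for c in add if c.name not in known)
        return TargetSchema(self.name, kept)


def find_sources(schema, catalog):
    """Map each target column to the source tables exposing a column of that name.

    Exact name matching is a simplifying assumption of this sketch.
    """
    return {
        col.name: sorted(t for t, cols in catalog.items() if col.name in cols)
        for col in schema.columns
    }


def ungrounded(schema, catalog):
    """List target columns that no source supplies, so they can be surfaced rather than guessed."""
    return [name for name, tables in find_sources(schema, catalog).items() if not tables]


# Hypothetical catalog: table name -> set of column names.
catalog = {
    "orders": {"order_id", "supplier_id", "amount"},
    "suppliers": {"supplier_id", "region"},
}

# First draft of the information need: "spend" exists in no source table.
v1 = TargetSchema("spend_by_region", [Column("region", "str"), Column("spend", "float")])

# One refinement step replaces the ungrounded column with one the corpus can supply.
v2 = v1.refine(add=[Column("amount", "float")], drop=["spend"])

print(ungrounded(v1, catalog))    # ['spend']
print(find_sources(v2, catalog))  # {'region': ['suppliers'], 'amount': ['orders']}
```

Keeping the schema as an explicit object means a column with no backing source shows up as a visible gap for the user to resolve, which is the kind of inspectability the abstract emphasizes.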

Problem

Research questions and friction points this paper is trying to address.

relational data

information need

natural-language requests

LLM hallucination

data workers

Innovation

Methods, ideas, or system contributions that make the work stand out.

relational reification

LLM-powered agentic architecture

iterative schema refinement