eSapiens: A Platform for Secure and Auditable Retrieval-Augmented Generation

📅 2025-07-13
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Enterprises face significant challenges in deploying LLMs, including poor governance of proprietary data, weak knowledge retention, inadequate workflow integration, and lack of auditability. To address these, this paper presents eSapiens, a secure and auditable Retrieval-Augmented Generation (RAG) platform. Methodologically, eSapiens includes the THOR Agent for structured, SQL-style querying of enterprise databases, integrates structured document ingestion with hybrid vector retrieval, and enables no-code AI pipeline orchestration via LangChain, supporting multiple model backends (e.g., OpenAI, Claude, Gemini, DeepSeek). Its key contributions are: (1) a RAG architecture with native structured-query capability, enabling traceable reasoning across text and databases; and (2) end-to-end auditability via fine-grained execution logging and immutable chain-of-execution recording. On legal corpora, a chunk size of 512 tokens gives the highest retrieval precision (91.3% Top-3 accuracy), and a TRACe-based generation test across five LLMs shows up to a 23% improvement in factual alignment.

📝 Abstract
We present eSapiens, an AI-as-a-Service (AIaaS) platform engineered around a business-oriented trifecta: proprietary data, operational workflows, and any major agnostic Large Language Model (LLM). eSapiens gives businesses full control over their AI assets, keeping everything in-house for AI knowledge retention and data security. eSapiens AI Agents (Sapiens) empower your team by providing valuable insights and automating repetitive tasks, enabling them to focus on high-impact work and drive better business outcomes. The system integrates structured document ingestion, hybrid vector retrieval, and no-code orchestration via LangChain, and supports top LLMs including OpenAI, Claude, Gemini, and DeepSeek. A key component is the THOR Agent, which handles structured SQL-style queries and generates actionable insights over enterprise databases. To evaluate the system, we conduct two experiments. First, a retrieval benchmark on legal corpora reveals that a chunk size of 512 tokens yields the highest retrieval precision (Top-3 accuracy: 91.3%). Second, a generation quality test using TRACe metrics across five LLMs shows that eSapiens delivers more context-consistent outputs with up to a 23% improvement in factual alignment. These results demonstrate the effectiveness of eSapiens in enabling trustworthy, auditable AI workflows for high-stakes domains like legal and finance.
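The retrieval benchmark in the abstract depends on two simple ideas: splitting documents into fixed-size chunks and checking whether the relevant chunk appears among the top three results. The sketch below illustrates both under stated assumptions. The abstract does not say which tokenizer, overlap, or evaluation code the authors used, so `chunk_tokens` and `top_k_accuracy` are hypothetical helper names, and the chunks here are plain non-overlapping slices of an already tokenized sequence.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], chunk_size: int = 512) -> List[List[str]]:
    """Split a token sequence into consecutive, non-overlapping chunks.

    Assumption: tokens are already produced by some tokenizer; the
    tokenizer used in the paper is not specified in the abstract.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def top_k_accuracy(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of queries whose gold chunk id is among the first k retrieved ids."""
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    hits = sum(1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k])
    return hits / len(gold_ids)


if __name__ == "__main__":
    chunks = chunk_tokens(["tok"] * 1100)
    print([len(c) for c in chunks])

    # Two toy queries: the gold chunk is ranked third for the first query
    # and fourth for the second, so only the first counts as a Top-3 hit.
    rankings = [["a", "b", "c", "d"], ["x", "y", "z", "w"]]
    print(top_k_accuracy(rankings, ["c", "w"], k=3))
```

Under this definition, a Top-3 accuracy of 91.3% would mean that for roughly nine out of ten benchmark queries the relevant chunk was among the first three retrieved.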
Problem

Research questions and friction points this paper is trying to address.

Secure AI platform for proprietary data control
Automates tasks with retrieval-augmented generation insights
Ensures auditable AI workflows for high-stakes domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid vector retrieval for precise data access
No-code orchestration via LangChain integration
THOR Agent for SQL-style query handling
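The listing does not describe how the hybrid retrieval combines its signals. One common reading of "hybrid" is a weighted blend of a dense (embedding) similarity and a lexical (keyword) score, and the sketch below illustrates that reading only. The function names, the `alpha` weight, and the toy two-dimensional vectors are all assumptions made for illustration, not details from the paper.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query: str, text: str) -> float:
    """Share of distinct query words that also occur in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & set(text.lower().split())) / len(query_words)


def hybrid_rank(
    query: str,
    query_vec: Sequence[float],
    docs: List[Tuple[str, str, Sequence[float]]],
    alpha: float = 0.5,
    k: int = 3,
) -> List[str]:
    """Return the ids of the k best documents under a blended score.

    score = alpha * dense similarity + (1 - alpha) * keyword overlap.
    Each document is an (id, text, embedding) triple.
    """
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_overlap(query, text), doc_id)
        for doc_id, text, vec in docs
    ]
    # Highest score first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


if __name__ == "__main__":
    docs = [
        ("a", "contract termination clause", [1.0, 0.0]),
        ("b", "weather report", [0.0, 1.0]),
        ("c", "termination notice", [0.7, 0.7]),
    ]
    print(hybrid_rank("termination clause", [1.0, 0.0], docs, k=2))
```

Setting `alpha` to 1.0 reduces this to pure vector search and 0.0 to pure keyword matching, which makes the blend easy to tune against a benchmark such as the Top-3 test mentioned in the abstract.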
Authors

Isaac Shi
eSapiens Team
Zeyuan Li
eSapiens Team
Fan Liu
eSapiens Team
Wenli Wang
eSapiens Team
Lewei He
South China Normal University
3D Printing, Deep Learning
Yang Yang
eSapiens Team
Tianyu Shi
University of Toronto
Reinforcement learning, Intelligent Transportation System, Large Language Models, AI, LLM agent