eSapiens: A Platform for Secure and Auditable Retrieval-Augmented Generation

📅 2025-07-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Enterprises face significant challenges in deploying LLMs, including poor governance of proprietary data, weak knowledge retention, inadequate workflow integration, and lack of auditability. To address these, this paper proposes eSapiens, a secure and auditable Retrieval-Augmented Generation (RAG) platform. Methodologically, eSapiens introduces the THOR Agent for structured, SQL-style database querying, integrates hybrid vector retrieval with chunking tuned to 512 tokens, and enables no-code AI pipeline orchestration via LangChain, supporting multi-model backends (e.g., OpenAI, Claude, Gemini, DeepSeek). Its key contributions are: (1) a RAG architecture with native structured-query capability, enabling traceable reasoning over both text and databases; and (2) end-to-end auditability via fine-grained execution logging and immutable chain-of-execution recording. Experiments on legal corpora show eSapiens achieves 91.3% Top-3 retrieval accuracy at a 512-token chunk size and improves TRACe factual alignment by up to 23% across five LLMs, enhancing generation consistency and trustworthiness.

📝 Abstract

We present eSapiens, an AI-as-a-Service (AIaaS) platform engineered around a business-oriented trifecta: proprietary data, operational workflows, and model-agnostic support for any major Large Language Model (LLM). eSapiens gives businesses full control over their AI assets, keeping everything in-house for AI knowledge retention and data security. eSapiens AI Agents (Sapiens) provide teams with insights and automate repetitive tasks, enabling them to focus on high-impact work and drive better business outcomes. The system integrates structured document ingestion, hybrid vector retrieval, and no-code orchestration via LangChain, and supports top LLMs including OpenAI, Claude, Gemini, and DeepSeek. A key component is the THOR Agent, which handles structured SQL-style queries and generates actionable insights over enterprise databases. To evaluate the system, we conduct two experiments. First, a retrieval benchmark on legal corpora reveals that a chunk size of 512 tokens yields the highest retrieval precision (Top-3 accuracy: 91.3%). Second, a generation quality test using TRACe metrics across five LLMs shows that eSapiens delivers more context-consistent outputs with up to a 23% improvement in factual alignment. These results demonstrate the effectiveness of eSapiens in enabling trustworthy, auditable AI workflows for high-stakes domains like legal and finance.
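To make the retrieval experiment concrete, the sketch below shows one way fixed-size chunking and a Top-3 accuracy score could be computed. It is an illustration under stated assumptions, not the eSapiens implementation: the function names `chunk_tokens` and `top_k_accuracy` are hypothetical, tokens are taken as an already-tokenized list (the abstract does not say which tokenizer is used), and each query is assumed to have exactly one relevant chunk.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], chunk_size: int = 512,
                 overlap: int = 0) -> List[List[str]]:
    """Split a token sequence into windows of at most `chunk_size` tokens.

    `overlap` is a hypothetical knob; the source only reports the chunk size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        # Stop once this window has reached the end of the sequence.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def top_k_accuracy(ranked_ids: List[List[str]], relevant_ids: List[str],
                   k: int = 3) -> float:
    """Fraction of queries whose relevant chunk id is among the first k results."""
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("one relevant id is required per query")
    if not ranked_ids:
        return 0.0
    hits = sum(1 for ranked, gold in zip(ranked_ids, relevant_ids)
               if gold in ranked[:k])
    return hits / len(ranked_ids)
```

Under this definition, a reported Top-3 accuracy of 91.3% would mean that for 91.3% of benchmark queries the relevant chunk appeared among the three highest-ranked results.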

Problem

Research questions and friction points this paper is trying to address.

Secure AI platform for proprietary data control

Automates tasks with retrieval-augmented generation insights

Ensures auditable AI workflows for high-stakes domains
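The summary does not describe how execution records are made auditable. One common approach to tamper-evident logging is a hash chain, sketched below with the Python standard library only. Everything here (the `append_entry` and `verify` helpers, the entry layout, the `GENESIS` constant) is a hypothetical illustration and should not be read as the platform's actual mechanism.

```python
import hashlib
import json
from typing import Any, Dict, List

# Placeholder "previous hash" for the first entry (an assumption of this sketch).
GENESIS = "0" * 64


def _digest(prev_hash: str, payload: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Append one execution record, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "payload": payload,
             "hash": _digest(prev_hash, payload)}
    log.append(entry)
    return entry


def verify(log: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A log built this way can be checked after the fact: changing the payload of any earlier step changes its hash, so `verify` returns `False` unless every later entry is rewritten as well.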

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid vector retrieval for precise data access

No-code orchestration via LangChain integration

THOR Agent for SQL-style query handling
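The summary does not specify how the hybrid retrieval combines its signals. As a rough illustration only, the sketch below assumes "hybrid" means merging a dense (embedding-based) ranking with a keyword ranking, and uses reciprocal rank fusion (RRF), a generic technique swapped in for the example rather than the method the paper reports. The function name and the constant `k = 60` (a conventional default for RRF) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of chunk ids into one list.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by chunk id for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical rankings from a dense retriever and a keyword retriever.
dense_ranking = ["c1", "c2", "c3"]
keyword_ranking = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([dense_ranking, keyword_ranking])
```

Chunks that rank well in both lists rise to the top, which is the usual motivation for combining retrievers: neither ranking has to be trusted on its own.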
