🤖 AI Summary
To address slow responses and weak handling of complex requests in enterprise compliance tasks, this paper proposes Compliance Brain Assistant (CBA), a dual-mode conversational agent architecture. A query router switches between two operating modes: FastTrack, which answers simple requests with lightweight, low-latency context retrieval from knowledge corpora, and FullAgentic, which handles complicated requests through multi-step actions, tool invocation, and knowledge retrieval. The goal is to balance response quality against latency. The key contribution is the integration of large language models (LLMs), knowledge corpora, and API services, so that the assistant can discover context across compliance artifacts and enrich its final response. Compared with an out-of-the-box LLM, CBA raises the average keyword match rate from 41.7% to 83.7% and the LLM-judge pass rate from 20.0% to 82.0%. The routing-based design also scores better on both metrics than the FastTrack-only and FullAgentic-only configurations while keeping run time approximately the same.
📝 Abstract
This paper presents Compliance Brain Assistant (CBA), a conversational, agentic AI assistant designed to make daily compliance tasks more efficient for personnel in enterprise environments. To balance response quality against latency, we design a user query router that chooses between (i) FastTrack mode, which handles simple requests that only need additional relevant context retrieved from knowledge corpora; and (ii) FullAgentic mode, which handles complicated requests that need composite actions and tool invocations to proactively discover context across various compliance artifacts, and/or calls to other APIs or models to fulfill the request. A typical example starts from a user query, uses its description to find a specific entity, and then uses that entity's information to query other APIs that curate and enrich the final AI response.
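The routing idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how the router decides, so `route_query` uses a toy cue list (`AGENTIC_CUES`), and `fast_track`, `full_agentic`, `find_entity`, and `enrich` are hypothetical stand-ins for retrieval, entity lookup, and downstream API calls, not CBA's actual components.

```python
from enum import Enum
from typing import Callable, Dict, Optional, Tuple


class Mode(Enum):
    FAST_TRACK = "fast_track"
    FULL_AGENTIC = "full_agentic"


# Hypothetical cues suggesting a request needs tools; not taken from the paper.
AGENTIC_CUES = ("look up", "status of", "compare", "owner of")


def route_query(query: str) -> Mode:
    """Toy router: queries containing a tool-like cue go to FullAgentic."""
    text = query.lower()
    if any(cue in text for cue in AGENTIC_CUES):
        return Mode.FULL_AGENTIC
    return Mode.FAST_TRACK


def fast_track(query: str, corpus: Dict[str, str]) -> str:
    """Stand-in for context retrieval: return passages whose topic is in the query."""
    text = query.lower()
    hits = [passage for topic, passage in corpus.items() if topic in text]
    return " ".join(hits) if hits else "No relevant context found."


def full_agentic(
    query: str,
    find_entity: Callable[[str], Optional[str]],
    enrich: Callable[[str], str],
) -> str:
    """Two-step chain: resolve an entity from the query, then enrich via an API."""
    entity = find_entity(query)
    if entity is None:
        return "No matching entity found."
    return enrich(entity)


def answer(
    query: str,
    corpus: Dict[str, str],
    find_entity: Callable[[str], Optional[str]],
    enrich: Callable[[str], str],
) -> Tuple[Mode, str]:
    """Route the query, then dispatch to the selected mode."""
    mode = route_query(query)
    if mode is Mode.FAST_TRACK:
        return mode, fast_track(query, corpus)
    return mode, full_agentic(query, find_entity, enrich)


if __name__ == "__main__":
    corpus = {"data retention": "Retention policy: delete after 90 days."}
    find_entity = lambda q: "project-x" if "project x" in q.lower() else None
    enrich = lambda entity: f"{entity}: review pending"
    print(answer("What is the data retention policy?", corpus, find_entity, enrich))
    print(answer("What is the status of the review for Project X?", corpus, find_entity, enrich))
```

In a real system the cue list would presumably be replaced by a learned or LLM-based classifier; the sketch only shows how a cheap routing decision in front of two execution paths keeps simple queries off the slower multi-step path.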
Our experimental evaluations compared CBA against an out-of-the-box LLM on real-world privacy/compliance-related queries targeting a range of personas. CBA substantially improved on the vanilla LLM's performance, raising the average keyword match rate from 41.7% to 83.7% and the LLM-judge pass rate from 20.0% to 82.0%. We also compared the full routing-based design against the `fast-track only` and `full-agentic` modes and found that it achieved a better average match rate and pass rate while keeping the run time approximately the same. This finding validated our hypothesis that the routing mechanism offers a good trade-off between the two modes.
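The two reported metrics can be made concrete with a short sketch. The definitions below are assumptions, since the abstract does not spell them out: keyword match rate is taken to be the fraction of expected keywords found in a response (case-insensitive substring match), averaged over queries, and pass rate is taken to be the fraction of responses that a judge marked as passing. The function names are hypothetical.

```python
from typing import Sequence


def keyword_match_rate(response: str, keywords: Sequence[str]) -> float:
    """Fraction of expected keywords that occur in the response (case-insensitive)."""
    if not keywords:
        return 0.0
    text = response.lower()
    matched = sum(1 for kw in keywords if kw.lower() in text)
    return matched / len(keywords)


def average_match_rate(
    responses: Sequence[str], keyword_sets: Sequence[Sequence[str]]
) -> float:
    """Mean per-query keyword match rate across an evaluation set."""
    if len(responses) != len(keyword_sets):
        raise ValueError("responses and keyword_sets must have the same length")
    if not responses:
        return 0.0
    rates = [keyword_match_rate(r, k) for r, k in zip(responses, keyword_sets)]
    return sum(rates) / len(rates)


def pass_rate(verdicts: Sequence[bool]) -> float:
    """Fraction of responses the judge marked as passing."""
    if not verdicts:
        return 0.0
    return sum(1 for v in verdicts if v) / len(verdicts)


if __name__ == "__main__":
    responses = [
        "Data must be deleted after 90 days.",
        "Consent is required before processing.",
    ]
    keyword_sets = [["deleted", "90 days", "consent"], ["consent", "processing"]]
    print(average_match_rate(responses, keyword_sets))
    print(pass_rate([True, False, True, True]))
```

Substring matching is the simplest choice and would count a keyword inside a longer word as a hit; an evaluation that needs stricter matching would tokenize first.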