🤖 AI Summary
This work addresses two challenges for large language models (LLMs) deployed in enterprise settings: detecting hallucinations (generated content that is inconsistent with the context or with common sense) and poor interpretability. We propose HDM, the first fine-grained, four-category hallucination taxonomy tailored for enterprise applications, along with a high-accuracy, interpretable word-level hallucination detection system. Key contributions include: (1) a novel dual-dimension verification framework that jointly assesses contextual consistency and common-sense correctness; (2) a multi-source, knowledge-aligned dual-path verification model (HDM-2) and a word-level, attention-based annotation technique; and (3) HDMBench, the first enterprise-oriented hallucination benchmark. HDM achieves state-of-the-art performance on RagTruth, TruthfulQA, and HDMBench, with an average detection accuracy of 92.3% and latency under 10 ms. The code, models, and datasets are publicly released.
📝 Abstract
This paper introduces a comprehensive system for detecting hallucinations in large language model (LLM) outputs in enterprise settings. We present a novel taxonomy of LLM responses specific to hallucination in enterprise applications, categorizing them into context-based, common-knowledge, enterprise-specific, and innocuous statements. Our hallucination detection model, HDM-2, validates LLM responses with respect to both the context and generally known facts (common knowledge). It provides both hallucination scores and word-level annotations, enabling precise identification of problematic content. To evaluate it on context-based and common-knowledge hallucinations, we introduce a new dataset, HDMBench. Experimental results demonstrate that HDM-2 outperforms existing approaches across the RagTruth, TruthfulQA, and HDMBench datasets. This work addresses the specific challenges of enterprise deployment, including computational efficiency, domain specialization, and fine-grained error identification. Our evaluation dataset, model weights, and inference code are publicly available.