FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data

📅 2025-01-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing fact-checking models for document-level LLM hallucination detection suffer from limited long-text processing capability, high computational overhead, and strong dependence on large foundation language models. To address these limitations, this paper proposes FactCG, a framework with two components: CG2C, a context-graph-to-claim synthetic data generation method that extracts context graphs from source documents and explicitly models multi-hop reasoning paths; and a graph-enhanced fact classification model that leverages structured graph reasoning to strengthen the multi-step logical discrimination of compact models. On the LLM-Aggrefact benchmark, FactCG outperforms GPT-4-o in factual consistency classification despite using orders-of-magnitude fewer parameters, the first such result for a lightweight model. This demonstrates that combining context graph modeling with multi-hop reasoning improves both robustness and efficiency in fact verification.

📝 Abstract
Prior research on training grounded factuality classification models to detect hallucinations in large language models (LLMs) has relied on public natural language inference (NLI) data and synthetic data. However, conventional NLI datasets are not well-suited for document-level reasoning, which is critical for detecting LLM hallucinations. Recent approaches to document-level synthetic data generation involve iteratively removing sentences from documents and annotating factuality using LLM-based prompts. While effective, this method is computationally expensive for long documents and limited by the LLM's capabilities. In this work, we analyze the differences between existing synthetic training data used in state-of-the-art models and real LLM output claims. Based on our findings, we propose a novel approach for synthetic data generation, CG2C, that leverages multi-hop reasoning on context graphs extracted from documents. Our fact checker model, FactCG, demonstrates improved performance with more connected reasoning, using the same backbone models. Experiments show it even outperforms GPT-4-o on the LLM-Aggrefact benchmark with much smaller model size.
Problem

Research questions and friction points this paper is trying to address.

Fact-Checking
Large Language Models
Computational Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

CG2C
FactCG
Error Detection in LLMs
Deren Lei
Meta GenAI
Natural Language Processing · Machine Learning
Yaxi Li
Zoom
Siyao Li
Microsoft Responsible AI
Mengya Hu
Rui Xu
Ken Archer
Mingyu Wang
Emily Ching
Alex Deng
Microsoft Responsible AI