Towards Effective Extraction and Evaluation of Factual Claims

📅 2025-02-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of inaccurate or incomplete claim extraction from long-form text generated by large language models (LLMs), which undermines downstream fact-checking. The authors propose a standardized framework for evaluating claim extraction in the fact-checking setting, together with automated, scalable, and replicable methods for applying it, including novel approaches for measuring coverage and decontextualization. They also introduce Claimify, an LLM-based extraction method designed to handle ambiguity by producing claims only when there is high confidence in the correct interpretation of the source text. The contributions are threefold: (1) a unified evaluation framework enabling fair comparison across extraction methods; (2) evidence that Claimify outperforms existing methods under this framework; and (3) a demonstration that confidence-aware extraction improves claim quality for fact-checking.

📝 Abstract
A common strategy for fact-checking long-form content generated by Large Language Models (LLMs) is extracting simple claims that can be verified independently. Since inaccurate or incomplete claims compromise fact-checking results, ensuring claim quality is critical. However, the lack of a standardized evaluation framework impedes assessment and comparison of claim extraction methods. To address this gap, we propose a framework for evaluating claim extraction in the context of fact-checking along with automated, scalable, and replicable methods for applying this framework, including novel approaches for measuring coverage and decontextualization. We also introduce Claimify, an LLM-based claim extraction method, and demonstrate that it outperforms existing methods under our evaluation framework. A key feature of Claimify is its ability to handle ambiguity and extract claims only when there is high confidence in the correct interpretation of the source text.
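The abstract's key idea, extracting claims only when the interpretation of the source text is unambiguous, can be illustrated with a minimal sketch. This is not the paper's implementation: the `extract_candidates` stub stands in for an LLM call, and all names, scores, and the threshold are hypothetical.

```python
# Hypothetical sketch of confidence-gated claim extraction.
# `extract_candidates` is a placeholder for an LLM-based extractor;
# here it returns canned examples with illustrative confidence scores.

from dataclasses import dataclass

@dataclass
class Candidate:
    claim: str         # a simple, independently verifiable statement
    confidence: float  # confidence that the interpretation is correct

def extract_candidates(sentence: str) -> list[Candidate]:
    # Placeholder for a model call; returns fixed examples for illustration.
    return [
        Candidate("The Eiffel Tower is in Paris.", 0.97),
        Candidate("It was painted red.", 0.40),  # ambiguous referent "It"
    ]

def claimify(sentence: str, threshold: float = 0.9) -> list[str]:
    # Keep only high-confidence claims; ambiguous candidates are
    # dropped rather than guessed at, trading recall for precision.
    return [c.claim for c in extract_candidates(sentence)
            if c.confidence >= threshold]

print(claimify("The Eiffel Tower in Paris was painted red."))
```

Lowering `threshold` would admit the ambiguous second candidate, which is exactly the failure mode a high-confidence constraint is meant to avoid.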
Problem

Research questions and friction points this paper is trying to address.

Evaluating claim extraction methods
Ensuring high claim quality
Handling ambiguity in claim extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standardized framework for evaluating claim extraction
Automated, scalable, and replicable evaluation methods
Claimify: an LLM-based method that extracts claims only under high confidence
Dasha Metropolitansky
Microsoft Research
Jonathan Larson
Microsoft Research