ClaimIQ at CheckThat! 2025: Comparing Prompted and Fine-Tuned Language Models for Verifying Numerical Claims

📅 2025-09-14
🤖 AI Summary
This study addresses fact-checking of numerical and temporal claims by proposing a verification framework that integrates retrieved evidence with the LLaMA large language model. Methodologically, it employs multi-granularity evidence retrieval using BM25 and MiniLM, followed by instruction tuning and parameter-efficient fine-tuning via LoRA; it systematically compares zero-shot prompting against fine-tuning strategies. The core contributions are twofold: (1) empirical identification of evidence granularity as a critical factor governing model generalization, and (2) characterization of the synergistic relationship between evidence selection quality and model adaptability. Experiments demonstrate substantial performance gains over baselines on the English validation set; however, degradation on the test set reveals two key challenges—insufficient evidence utilization and limited cross-scenario generalization—thereby providing both empirical grounding and concrete directions for advancing trustworthy fact-checking research.
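The BM25 stage of the multi-granularity evidence retrieval described above can be sketched in a few lines. This is a minimal from-scratch illustration under stated assumptions, not the authors' implementation: tokenization is naive whitespace splitting, the `k1`/`b` values are common defaults, and the subsequent MiniLM re-ranking step is omitted.

```python
import math
from collections import Counter

def bm25_topk(claim, sentences, k=3, k1=1.5, b=0.75):
    """Rank candidate evidence sentences against a claim with BM25
    and keep the top-k. Illustrative sketch only."""
    tokenized = [s.lower().split() for s in sentences]
    N = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / N

    # document frequency of each term across the sentence pool
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    def score(q_toks, d_toks):
        tf = Counter(d_toks)
        s = 0.0
        for t in q_toks:
            if t not in tf:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            norm = tf[t] + k1 * (1 - b + b * len(d_toks) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        return s

    q = claim.lower().split()
    ranked = sorted(range(N), key=lambda i: score(q, tokenized[i]), reverse=True)
    return [sentences[i] for i in ranked[:k]]
```

In the paper's pipeline, sentences selected this way (or full documents, depending on the granularity setting) are then passed to the LLM as evidence.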

📝 Abstract
This paper presents our system for Task 3 of the CLEF 2025 CheckThat! Lab, which focuses on verifying numerical and temporal claims using retrieved evidence. We explore two complementary approaches: zero-shot prompting with instruction-tuned large language models (LLMs) and supervised fine-tuning using parameter-efficient LoRA. To enhance evidence quality, we investigate several selection strategies, including full-document input and top-k sentence filtering using BM25 and MiniLM. Our best-performing model, LLaMA fine-tuned with LoRA, achieves strong performance on the English validation set. However, a notable performance drop on the test set highlights a generalization challenge. These findings underscore the importance of evidence granularity and model adaptation for robust numerical fact verification.
Problem

Research questions and friction points this paper is trying to address.

Verifying numerical claims using retrieved evidence
Comparing prompted and fine-tuned language models
Investigating evidence selection strategies for verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot prompting with instruction-tuned LLMs
Supervised fine-tuning using LoRA technique
Evidence selection with BM25 and MiniLM
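The zero-shot setup listed above amounts to assembling the claim and the selected evidence into a verification prompt. The template below is a hypothetical illustration; the exact wording and label set used in the paper are assumptions, not quoted from it.

```python
def build_prompt(claim, evidence_sentences):
    """Assemble a zero-shot claim-verification prompt from a claim and
    its retrieved evidence. Template wording is illustrative only."""
    evidence = "\n".join(f"- {s}" for s in evidence_sentences)
    return (
        "You are a fact-checking assistant. Using only the evidence "
        "below, classify the numerical claim as True, False, or "
        "Conflicting.\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Claim: {claim}\n\n"
        "Label:"
    )
```

In the fine-tuned variant, the same (prompt, gold label) pairs would serve as instruction-tuning data for LoRA training instead of being sent to the model zero-shot.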
Anirban Saha Anik
Department of Data Science, University of North Texas, Denton, TX, USA
Md Fahimul Kabir Chowdhury
Department of Computer Science and Engineering, University of North Texas, Denton, TX, USA
Andrew Wyckoff
Department of Computer Science and Engineering, University of North Texas, Denton, TX, USA
Sagnik Ray Choudhury
University of North Texas
digital library · NLP · explainability · information extraction