🤖 AI Summary
This study addresses fact-checking of numerical and temporal claims by proposing a verification framework that integrates retrieved evidence with the LLaMA large language model. Methodologically, it employs multi-granularity evidence retrieval using BM25 and MiniLM, followed by instruction tuning and parameter-efficient fine-tuning via LoRA; it systematically compares zero-shot prompting against fine-tuning strategies. The core contributions are twofold: (1) empirical identification of evidence granularity as a critical factor governing model generalization, and (2) characterization of the synergistic relationship between evidence selection quality and model adaptability. Experiments demonstrate substantial performance gains over baselines on the English validation set; however, degradation on the test set reveals two key challenges—insufficient evidence utilization and limited cross-scenario generalization—thereby providing both empirical grounding and concrete directions for advancing trustworthy fact-checking research.
📝 Abstract
This paper presents our system for Task 3 of the CLEF 2025 CheckThat! Lab, which focuses on verifying numerical and temporal claims using retrieved evidence. We explore two complementary approaches: zero-shot prompting with instruction-tuned large language models (LLMs) and supervised fine-tuning with parameter-efficient LoRA. To improve evidence quality, we investigate several selection strategies, including full-document input and top-k sentence filtering using BM25 and MiniLM. Our best-performing model, LLaMA fine-tuned with LoRA, achieves strong performance on the English validation set. However, a notable performance drop on the test set highlights a generalization challenge. These findings underscore the importance of evidence granularity and model adaptation for robust numerical fact verification.
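The top-k sentence filtering step described above can be illustrated with a minimal sketch. The paper itself relies on off-the-shelf BM25 and MiniLM retrievers; the pure-Python BM25 re-implementation below (function name, claim, and sentences are hypothetical) only shows the idea of scoring candidate evidence sentences against a claim and keeping the k best:

```python
import math
import re
from collections import Counter

def bm25_topk(claim, sentences, k=3, k1=1.5, b=0.75):
    """Rank evidence sentences against a claim with BM25; return the top k.

    Illustrative sketch only: a real system would use a tuned BM25
    implementation (or a MiniLM cross-/bi-encoder) rather than this
    from-scratch scorer.
    """
    tokenize = lambda text: re.findall(r"\w+", text.lower())
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average sentence length
    # Document frequency per term, used for IDF.
    df = Counter(t for d in docs for t in set(d))

    def idf(term):
        return math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)

    query = tokenize(claim)
    scores = []
    for d in docs:
        tf = Counter(d)
        # Standard BM25 term saturation with length normalization.
        score = sum(
            idf(t) * tf[t] * (k1 + 1)
            / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
            for t in query
        )
        scores.append(score)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:k]]
```

The selected sentences would then be concatenated into the LLM prompt in place of the full document, which is the granularity trade-off the paper studies.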