🤖 AI Summary
This work proposes a unified, end-to-end approach to automated fact-checking by leveraging multi-task learning on small open-source decoder-only language models (e.g., Qwen3-4B), addressing the high cost and maintenance burden of traditional pipelines that rely on multiple specialized models. While large language models exhibit strong performance, their adoption is often hindered by closed-source licensing, complexity, and expense. The proposed method jointly optimizes claim detection, evidence ranking, and stance detection through a combination of classification heads, causal language modeling heads, and instruction tuning. Evaluated against zero-shot and few-shot baselines, this approach achieves relative performance improvements of up to 44%, 54%, and 31% on the three subtasks, respectively, while offering a reproducible and sustainable framework for practical deployment.
📝 Abstract
Large language models (LLMs) are reshaping automated fact-checking (AFC) by enabling unified, end-to-end verification pipelines rather than isolated components. While large proprietary models achieve strong performance, their closed weights, complexity, and high costs limit sustainability. Fine-tuning smaller open-weight models for individual AFC tasks can help, but it requires multiple specialized models, resulting in high costs. We propose \textbf{multi-task learning (MTL)} as a more efficient alternative that fine-tunes a single model to perform claim detection, evidence ranking, and stance detection jointly. Using small decoder-only LLMs (e.g., Qwen3-4B), we explore three MTL strategies (classification heads, causal language modeling heads, and instruction tuning) and evaluate them across model sizes, task orders, and standard non-LLM baselines. While multi-task models do not universally surpass single-task baselines, they yield substantial improvements, achieving up to \textbf{44\%}, \textbf{54\%}, and \textbf{31\%} relative gains for claim detection, evidence re-ranking, and stance detection, respectively, over zero-/few-shot settings. Finally, we provide practical, empirically grounded guidelines to help practitioners apply MTL with LLMs for automated fact-checking.
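To make the multi-task objective concrete, the sketch below shows the core idea of MTL with task-specific heads on a shared representation: per-task losses (here, toy cross-entropies for claim detection and stance detection) are summed into one joint loss, and a single gradient step on the shared parameters lowers it. This is a deliberately tiny NumPy illustration under assumed shapes and names (`W_shared`, `W_claim`, `W_stance` are all hypothetical), not the paper's actual setup, which fine-tunes a decoder-only LLM such as Qwen3-4B.

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_HID = 8, 16
N_CLAIM, N_STANCE = 2, 3  # claim detection (binary), stance detection (3-way)

# Hypothetical shared "encoder" plus two task-specific classification heads.
W_shared = rng.normal(0, 0.1, (D_HID, D_IN))
W_claim = rng.normal(0, 0.1, (N_CLAIM, D_HID))
W_stance = rng.normal(0, 0.1, (N_STANCE, D_HID))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def joint_loss(x, y_claim, y_stance):
    h = np.tanh(W_shared @ x)            # shared representation
    p_claim = softmax(W_claim @ h)       # claim-detection head
    p_stance = softmax(W_stance @ h)     # stance-detection head
    # Multi-task objective: sum of the per-task cross-entropies.
    return -np.log(p_claim[y_claim]) - np.log(p_stance[y_stance])

x, y_c, y_s = rng.normal(size=D_IN), 1, 2
loss_before = joint_loss(x, y_c, y_s)

# One numerical-gradient descent step on the shared weights only,
# showing that both tasks jointly shape the shared parameters.
eps, lr = 1e-4, 0.1
grad = np.zeros_like(W_shared)
for i in range(D_HID):
    for j in range(D_IN):
        W_shared[i, j] += eps
        up = joint_loss(x, y_c, y_s)
        W_shared[i, j] -= 2 * eps
        down = joint_loss(x, y_c, y_s)
        W_shared[i, j] += eps
        grad[i, j] = (up - down) / (2 * eps)
W_shared -= lr * grad

loss_after = joint_loss(x, y_c, y_s)
print(loss_after < loss_before)
```

In an actual LLM fine-tuning run the same pattern holds: the backbone plays the role of `W_shared`, and the classification or causal-language-modeling heads contribute per-task terms to one combined loss.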