SciClaimEval: Cross-modal Claim Verification in Scientific Papers

📅 2026-02-07
🤖 AI Summary
This work addresses the scarcity of realistic, diverse multimodal datasets that contain refuted claims for scientific claim verification. The authors introduce SciClaimEval, a dataset of 1,664 annotated samples drawn from 180 papers in machine learning, natural language processing, and medicine. Rather than rewriting claims or having LLMs fabricate contradictions, SciClaimEval creates refuted claims by modifying the supporting evidence (figures and tables) from the original papers. Figures are provided as images, and tables as images, LaTeX source, HTML, and JSON. The dataset is built with a multimodal processing pipeline and validated through expert annotation. Benchmarking 11 open-source and proprietary multimodal foundation models shows that figure-based verification remains particularly challenging, with a substantial gap between the best system and the human baseline.

📝 Abstract
We present SciClaimEval, a new scientific dataset for the claim verification task. Unlike existing resources, SciClaimEval features authentic claims, including refuted ones, directly extracted from published papers. To create refuted claims, we introduce a novel approach that modifies the supporting evidence (figures and tables), rather than altering the claims or relying on large language models (LLMs) to fabricate contradictions. The dataset provides cross-modal evidence with diverse representations: figures are available as images, while tables are provided in multiple formats, including images, LaTeX source, HTML, and JSON. SciClaimEval contains 1,664 annotated samples from 180 papers across three domains (machine learning, natural language processing, and medicine), validated through expert annotation. We benchmark 11 multimodal foundation models, both open-source and proprietary, on the dataset. Results show that figure-based verification remains particularly challenging for all models, with a substantial performance gap between the best system and the human baseline.

Problem

Research questions and friction points this paper is trying to address.

claim verification
scientific papers
cross-modal
multimodal evidence
refuted claims

Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-modal claim verification
scientific claim evaluation
multimodal evidence
refuted claims generation
SciClaimEval
Xanh Ho
National Institute of Informatics, Japan
Yun-Ang Wu
National Taiwan University; NII LLMC, Japan
Sunisth Kumar
The University of Tokyo, Japan
Tian Cheng Xia
University of Bologna, Italy
Florian Boudin
Associate Professor, LS2N - Nantes Université and JFLI - National Institute of Informatics / Tokyo
Natural Language Processing, Information Retrieval, Computational Linguistics
André Greiner-Petter
National Institute of Informatics, Japan; University of Göttingen, Germany
Akiko Aizawa
National Institute of Informatics, Japan; NII LLMC, Japan; The University of Tokyo, Japan