Use as Directed? A Comparison of Software Tools Intended to Check Rigor and Transparency of Published Work

📅 2025-07-23
🤖 AI Summary
The reproducibility crisis in scientific research stems partly from insufficient reporting transparency and inconsistent adherence to standardized practices. This study systematically evaluates 11 automated tools against nine rigor criteria from the ScreenIT group (including open data availability, disclosure of inclusion/exclusion criteria, and preregistration), assessing their capacity to detect compliance. Results reveal that individual tools often exhibit limited sensitivity, and that combining tools improves overall detection rates; for some criteria, notably identifying open data statements, a single tool clearly outperforms the rest. This work presents a broad cross-tool empirical comparison and proposes concrete, evidence-based directions for tool development and refinement. All code and data are publicly released, providing a reproducible methodology and practical guidance for advancing research transparency.

📝 Abstract
The causes of the reproducibility crisis include a lack of standardization and transparency in scientific reporting. Checklists such as ARRIVE and CONSORT seek to improve transparency, but they are not always followed by authors, and peer review often fails to identify missing items. To address these issues, several automated tools have been designed to check different rigor criteria. We conducted a broad comparison of 11 automated tools across 9 different rigor criteria from the ScreenIT group. For some criteria, including detecting open data, there was a clear winner: a single tool that performed much better than the others. In other cases, including detection of inclusion and exclusion criteria, a combination of tools exceeded the performance of any one tool. We also identified key areas where tool developers should focus their efforts to make their tools maximally useful. We conclude with a set of insights and recommendations for stakeholders in the development of rigor and transparency detection tools. The code and data for the study are available at https://github.com/PeterEckmann1/tool-comparison.
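The ensemble result described in the abstract, where combining several detectors beats any single one, can be sketched as follows. This is a minimal illustration under assumptions: each tool is modeled as a binary per-paper judgment combined by union vote, and all tool names, outputs, and ground-truth labels here are invented for the example, not taken from the paper.

```python
# Hypothetical sketch: each "tool" emits a binary judgment per paper
# (True = criterion detected, e.g. an open-data statement). Combining
# tools by union vote can only match or increase recall relative to any
# single tool, at a possible cost in precision.

def union_ensemble(tool_outputs):
    """Flag a paper if ANY tool flags it (recall-favoring combination)."""
    return [any(votes) for votes in zip(*tool_outputs)]

def recall(pred, truth):
    """Fraction of truly positive papers that were flagged."""
    true_pos = sum(p and t for p, t in zip(pred, truth))
    actual_pos = sum(truth)
    return true_pos / actual_pos if actual_pos else 0.0

# Invented ground truth and tool outputs for five papers.
truth  = [True, True, False, True, False]
tool_a = [True, False, False, True, False]   # misses paper 2
tool_b = [False, True, False, True, True]    # misses paper 1, false positive on paper 5

combined = union_ensemble([tool_a, tool_b])
print(recall(tool_a, truth))    # 0.666...
print(recall(tool_b, truth))    # 0.666...
print(recall(combined, truth))  # 1.0
```

A union vote is only one possible combination strategy; majority voting or weighting tools by validated per-criterion accuracy are alternatives when precision matters as much as recall.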
Problem

Research questions and friction points this paper is trying to address.

Comparing automated tools for checking scientific rigor criteria
Evaluating tool performance in detecting transparency and open data
Identifying gaps and recommendations for improving rigor detection tools
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated tools check rigor criteria
Combined tools outperform single tools
Identified key areas for tool improvement
Peter Eckmann
Department of Computer Science and Engineering, UC San Diego, La Jolla, CA, United States

Adrian Barnett
School of Public Health and Social Work, Queensland University of Technology, Kelvin Grove, Australia

Alexandra Bannach-Brown
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Elisa Pilar Bascunan Atria
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Guillaume Cabanac
Professor of Computer Science, University of Toulouse & Institut Universitaire de France
Scientific Literature Mining · Research Integrity · Scientometrics · Sleuthing

Louise Delwen Owen Franzen
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Małgorzata Anna Gazda
Department of Biological Sciences, University of Montréal, 1375 Avenue Thérèse-Lavoie-Roux, H3C 3J7 Montréal, Québec, Canada

Kaitlyn Hair
UCL Social Research Institute, University College London, London

James Howison
Professor, University of Texas at Austin
Computer Supported Cooperative Work · Cyberinfrastructure · Open Source Software · Scientific Software

Halil Kilicoglu
Associate Professor, University of Illinois at Urbana-Champaign
Natural Language Processing · Information Extraction · Biomedical Informatics · Computational Semantics · Question Answering

Cyril Labbe
Université Grenoble Alpes, France

Sarah McCann
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Vladislav Nachev
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Martijn Roelandse
martijnroelandse.dev, Ouderkerk aan de Amstel, Netherlands

Maia Salholz-Hillel
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Robert Schulz
QUEST Center for Responsible Research, Berlin Institute of Health at Charité Universitätsmedizin Berlin, Germany

Gerben ter Riet
Hogeschool van Amsterdam, Amsterdam University of Applied Sciences, Amsterdam, Netherlands

Colby Vorland
Assistant Research Scientist, Indiana University
Nutrition · aging · meta-research · research integrity · automation

Anita Bandrowski
Department of Neuroscience, UC San Diego, La Jolla, CA, United States; SciCrunch Inc.

Tracey Weissgerber
QUEST (Quality | Ethics | Open Science | Translation), Berlin Institute of Health, Charité
meta-research · data visualization · statistics · preeclampsia · vascular physiology