🤖 AI Summary
Addressing the poor reproducibility and high environmental heterogeneity of computational experiments across disciplines, this paper proposes SciRep—a framework that unifies the management of code, data, programming languages, dependencies, and execution commands through containerized encapsulation, declarative experiment specifications (YAML/JSON), dependency snapshotting, and deterministic scheduling, producing lightweight, portable “capsule” packages. SciRep introduces a domain-agnostic reproducibility packaging paradigm, embodying “configuration-as-documentation” and “execution-as-verification,” and enables end-to-end reproducible workflows across heterogeneous domains, including medicine, bioinformatics, and computer science. It supports multi-language ecosystems (e.g., Python, R, Julia) on Linux/macOS. In the empirical evaluation, SciRep successfully reproduced 16 of 18 published experiments (89%), surpassing the best prior tool’s 61% reproducibility rate; moreover, every experiment that executed reproduced the original published results exactly.
📝 Abstract
In recent years, not only the research community but also the general public has raised serious questions about the reproducibility and replicability of scientific work. Since many studies include some kind of computational work, these issues are also a technological challenge—not only in computer science, but in most research domains. Computational replicability and reproducibility are hard to achieve because of the variety of computational environments in use: it is challenging to recreate the same environment with the same frameworks, code, programming languages, dependencies, and so on. We propose a framework, SciRep, that supports the configuration, execution, and packaging of computational experiments by defining their code, data, programming languages, dependencies, databases, and the commands to be executed. After the initial configuration, an experiment can be executed any number of times, always producing exactly the same results. Our approach allows the creation of a reproducibility package for experiments from multiple scientific fields, from medicine to computer science, which can be re-executed on any computer. The produced package acts as a capsule, holding everything necessary to re-execute the experiment. To evaluate our framework, we compare it with three state-of-the-art tools and use it to reproduce 18 experiments extracted from published scientific articles. With our approach, we were able to execute 16 (89%) of those experiments, while the best of the other tools reached only 61%, showing that our approach is effective. Moreover, every experiment that executed produced the results presented in the original publication: SciRep reproduced 100% of the experiments it could run.
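To make the idea of a declarative experiment specification concrete, the following is a purely illustrative YAML sketch of what such a capsule configuration might contain. The paper's actual schema is not shown here, so every field name (`experiment`, `dependencies`, `commands`, etc.) is an assumption, not SciRep's real format:

```yaml
# Hypothetical capsule specification (illustrative only; field names
# are assumptions, not SciRep's actual schema)
experiment:
  name: gene-expression-analysis
  language: r                 # multi-language support: python, r, julia, ...
  dependencies:               # versions pinned and snapshotted into the capsule
    - r-base=4.3.1
    - bioconductor-limma=3.56.2
  data:
    - data/expression_matrix.csv
  commands:                   # executed deterministically inside the container
    - Rscript analysis.R --input data/expression_matrix.csv
  output: results/
```

The point of such a file is that it doubles as documentation ("configuration-as-documentation"): re-running the capsule from the pinned dependencies and declared commands is itself the verification step.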