🤖 AI Summary
This study addresses the pervasive yet little-studied problem of technical debt in research (scientific) software, which can compromise reliability, maintainability, and scientific validity. Using a multi-method approach, the authors first mine self-admitted technical debt from about 28,000 code comments across nine research software projects, then interview research software engineers and scientists about how that debt manifests and what it costs research software and research outputs. The work identifies nine types of self-admitted technical debt unique to research software, along with four themes that influence this debt. The findings characterize the forms technical debt takes in scientific contexts and how the teams developing the software perceive it.
📝 Abstract
Research software (also called scientific software) is essential for advancing scientific endeavours. Research software encapsulates complex algorithms and domain-specific knowledge and is a fundamental component of all science. A pervasive challenge in developing research software is technical debt, which can adversely affect reliability, maintainability, and scientific validity. Research software often relies on the initiative of the scientific community for maintenance, requiring diverse expertise in both scientific and software engineering domains. The extent and nature of technical debt in research software are little studied, in particular what forms it takes and what the science teams developing this software think about it. In this paper, we describe our multi-method study of technical debt in research software. We begin by examining instances of self-reported technical debt in research code, analyzing 28k code comments across nine research software projects. Then, building on our findings, we interview research software engineers and scientists about how this technical debt manifests itself in their experience, and what costs it has for research software and research outputs more generally. We identify nine types of self-admitted technical debt unique to research software, and four themes impacting this technical debt.