Unravelling Technical debt topics through Time, Programming Languages and Repository

📅 2025-04-16
🤖 AI Summary
This work addresses the insufficient understanding of technical debt (TD) topic diversity and its dynamic evolution mechanisms. We propose the first spatiotemporal, multilingual, and cross-repository framework for TD topic evolution analysis. Leveraging GitHub issue data from 2015 to September 2023, our method integrates BERTopic-based topic modeling, fine-grained sentiment analysis (VADER and Transformer-based), temporal topic tracking, and multidimensional cross-statistics to systematically characterize TD topic lifecycles, popularity shifts, and language-specific preferences (e.g., architectural debt predominates in Java, while dependency debt prevails in JavaScript). We identify 12 core TD topics and achieve 86.3% accuracy in sentiment polarity classification. Crucially, we introduce the first three-dimensional dynamic modeling of the TD topic taxonomy, spanning the time, language, and repository dimensions, thereby bridging critical gaps in research on TD perception trends and structural evolution.

📝 Abstract
This study explores the dynamic landscape of Technical Debt (TD) topics in software engineering by examining their evolution across time, programming languages, and repositories. Despite extensive research on identifying and quantifying TD, there remains a significant gap in understanding the diversity of TD topics and their temporal development. To address this, we conducted an exploratory analysis of TD data extracted from GitHub issues spanning 2015 to September 2023. We employed BERTopic for topic modelling, categorising the TD topics and tracking their progression over time. We also applied sentiment analysis to each identified topic, providing deeper insight into the perceptions and attitudes associated with it. Together, these analyses offer a more nuanced understanding of the trends and shifts in TD topics across time, programming language, and repository.
Problem

Research questions and friction points this paper is trying to address.

Exploring how Technical Debt topics evolve across time, programming languages, and repositories
Analyzing GitHub issues to categorize and track TD topics
Incorporating sentiment analysis for TD topic perceptions
Innovation

Methods, ideas, or system contributions that make the work stand out.

BERTopic for sophisticated topic modeling
Sentiment analysis on identified topics
Analysis of GitHub issues from 2015 to September 2023
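
The cross-dimensional analysis described above (topics tracked by time, language, and repository, with sentiment per topic) can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: it assumes each issue has already been given a topic label (for example by a topic model such as BERTopic) and a polarity score by some sentiment model. The `Issue` record, the helper names, the topic labels, and the toy data are all hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Issue:
    """One TD-related GitHub issue (hypothetical record layout)."""
    year: int
    language: str
    repo: str
    topic: str        # assumed to be assigned upstream by a topic model
    sentiment: float  # assumed polarity score in [-1, 1] from a sentiment model


def topic_counts_by(issues, key):
    """Count issues per topic, grouped by `key` ('year', 'language', or 'repo')."""
    counts = defaultdict(Counter)
    for issue in issues:
        counts[getattr(issue, key)][issue.topic] += 1
    # Convert to plain dicts so results are easy to compare and print
    return {group: dict(topics) for group, topics in counts.items()}


def mean_sentiment_by_topic(issues):
    """Average the polarity scores of all issues that share a topic."""
    scores = defaultdict(list)
    for issue in issues:
        scores[issue.topic].append(issue.sentiment)
    return {topic: mean(values) for topic, values in scores.items()}


# Toy records, invented for illustration only (not data from the paper)
ISSUES = [
    Issue(2015, "Java", "org/a", "refactoring", -0.5),
    Issue(2015, "Python", "org/b", "testing", 0.0),
    Issue(2016, "Java", "org/a", "refactoring", -0.25),
    Issue(2016, "Python", "org/b", "refactoring", 0.0),
]

if __name__ == "__main__":
    print(topic_counts_by(ISSUES, "year"))
    print(topic_counts_by(ISSUES, "language"))
    print(mean_sentiment_by_topic(ISSUES))
```

Grouping by a single attribute name keeps the three dimensions (time, language, repository) symmetric: the same function answers "which topics grew over the years" and "which topics dominate in a given language" by changing only the `key` argument.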