How Requirements Quality Makes (or Breaks) Traceability Link Recovery

📅 2026-06-10
🤖 AI Summary
This study addresses the scarcity of empirical evidence on how quality defects in requirements affect automated traceability link recovery (TLR). The authors annotate 28 types of quality defects in 189 use case descriptions from two datasets, run five distinct TLR approaches, and use statistical tests to quantify how strongly each defect affects performance. The findings show that defect types influence TLR differently: some degrade performance, such as sentences that do not start with noun phrases, while others improve it, such as use cases that include implementation details. Because different types of approaches respond differently to these defects, the best-performing TLR approach depends on the quality of the requirements in the dataset.
📝 Abstract
Traceability information between requirements and source code greatly benefits the maintenance of a software system. Since manually establishing trace links is cumbersome and error-prone, previous research explored automated traceability link recovery (TLR) approaches to support this task. Quality defects in requirements are expected to impact subsequent activities such as TLR, yet evidence about this impact remains scarce. Our objective is to contribute empirical evidence on this impact. At the same time, we aim to understand how the performance of TLR approaches varies given these quality defects. To this end, we annotated 28 types of quality defects in 189 use case descriptions from two datasets. Then, we executed five distinct TLR approaches on these datasets and measured their performance in recovering trace links. Finally, we performed statistical tests to quantify the defects' effect strength on this performance. Our results show that some quality defects harm TLR performance, e.g., sentences that do not start with noun phrases, while others actually benefit performance, e.g., use cases that include implementation details. Moreover, different types of approaches respond differently to these defects. As a consequence, the performance-optimizing choice of a TLR approach depends on the quality of the dataset.
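The abstract does not name the statistical tests or the effect size measure used, so the following is only a minimal sketch of what such an analysis could look like, not the authors' procedure. It assumes a two-sided permutation test on the difference in means for significance and Cliff's delta for effect strength. The function names and the per-use-case F1 scores are hypothetical and purely illustrative.

```python
import random
from itertools import product
from statistics import mean


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs, in [-1, 1]."""
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def permutation_p_value(xs, ys, rounds=5000, seed=0):
    """Two-sided permutation test on the absolute difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(xs) - mean(ys))
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(rounds):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(xs)]) - mean(pooled[len(xs):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical F1 scores of one TLR approach, grouped by whether the
# use case was annotated with a given defect (made-up numbers).
with_defect = [0.21, 0.30, 0.25, 0.18, 0.33, 0.27]
without_defect = [0.41, 0.38, 0.52, 0.35, 0.47, 0.44]

delta = cliffs_delta(with_defect, without_defect)
p_value = permutation_p_value(with_defect, without_defect)
print(f"Cliff's delta = {delta:.2f}, p = {p_value:.4f}")
```

A negative delta here would indicate that use cases with the defect tend to score lower, and a positive delta the opposite, which matches the abstract's observation that some defects harm and others benefit TLR performance. Repeating the comparison per defect type and per TLR approach would show whether approaches respond differently.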
Problem

Research questions and friction points this paper is trying to address.

requirements quality
traceability link recovery
quality defects
software maintenance
empirical study
Innovation

Methods, ideas, or system contributions that make the work stand out.

traceability link recovery
requirements quality
quality defects
empirical study
software maintenance