Multi-Label Requirements Classification with Large Taxonomies

📅 2024-06-07
🏛️ IEEE International Requirements Engineering Conference
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study addresses multi-label annotation of software requirements against large, hierarchical taxonomies and investigates zero-shot learning as a way to avoid the annotation cost of supervised training. Together with industry domain experts, the authors associated 129 requirements with 769 labels from taxonomies of 250 to 1183 classes, then ran a controlled experiment on classifier type, hierarchy, and taxonomy structure. A sentence-based classifier achieved significantly higher recall than a word-based classifier, without a significant gain in precision or F1-score. The hierarchical classification strategy did not consistently improve performance, and the numbers of total and leaf nodes in a taxonomy correlated strongly and negatively with the recall of the hierarchical sentence-based classifier.

📝 Abstract
Context and motivation: Classification aids software development activities by organizing requirements in classes for easier access and retrieval. The majority of requirements classification research has, so far, focused on binary or multi-class classification. Question/problem: Multi-label classification with large taxonomies could aid requirements traceability but is prohibitively costly with supervised training. Hence, we investigate zero-shot learning to evaluate the feasibility of multi-label requirements classification with large taxonomies. Principal ideas/results: We associated, together with domain experts from industry, 129 requirements with 769 labels from taxonomies ranging between 250 and 1183 classes. Then, we conducted a controlled experiment to study the impact of the type of classifier, the hierarchy, and the structural characteristics of taxonomies on the classification performance. The results show that: (1) The sentence-based classifier had a significantly higher recall compared to the word-based classifier; however, the precision and F1-score did not improve significantly. (2) The hierarchical classification strategy did not always improve the performance of requirements classification. (3) The total and leaf nodes of the taxonomies have a strong negative correlation with the recall of the hierarchical sentence-based classifier. Contribution: We investigate the problem of multi-label requirements classification with large taxonomies, illustrate a systematic process to create a ground truth involving industry participants, and provide an analysis of different classification pipelines using zero-shot learning.
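To make the zero-shot idea in the abstract concrete, here is a minimal sketch under stated assumptions. It is not the paper's pipeline: it scores each label by the cosine similarity between bag-of-words vectors of the requirement and of a label description, and keeps every label above a threshold, so one requirement can receive several labels without any training data. The function names, the threshold, the label descriptions, and the requirement text are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_labels(requirement, label_descriptions, threshold=0.2):
    """Return every label whose description is similar enough to the requirement.

    No training data is used: labels are scored only by comparing text.
    The threshold is a hypothetical value chosen for this toy example.
    """
    req_vec = Counter(tokenize(requirement))
    scored = []
    for label, description in label_descriptions.items():
        score = cosine(req_vec, Counter(tokenize(description)))
        if score >= threshold:
            scored.append((label, score))
    # Highest-scoring labels first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in scored]


# Hypothetical taxonomy fragment and requirement, not taken from the study.
labels = {
    "Fire door": "door with fire resistance rating",
    "Ventilation": "air ventilation duct system",
    "Lighting": "electric lighting fixture",
}
requirement = "The door shall have a fire resistance of 60 minutes."
predicted = zero_shot_labels(requirement, labels)
print(predicted)
```

A sentence-based variant would presumably replace the token-count vectors with sentence embeddings while keeping the same score-and-threshold structure; that substitution is an assumption, not something the abstract spells out.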
Problem

Research questions and friction points this paper is trying to address.

Investigates multi-label classification for software requirements with large taxonomies
Evaluates zero-shot learning feasibility for cost-effective multi-label classification
Analyzes the impact of classifier type and taxonomy structure on classification performance
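The performance measures named in the abstract (precision, recall, F1-score) can be sketched for a single requirement as set comparisons between predicted and expected labels. This is a generic illustration: how the study aggregates scores across requirements is not stated here, and the label names are hypothetical.

```python
def precision_recall_f1(predicted, expected):
    """Compare predicted and expected label sets for one requirement."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: the classifier finds both expected labels
# but also proposes two incorrect ones.
p, r, f1 = precision_recall_f1(
    predicted=["Fire door", "Fire alarm", "Lighting", "Ventilation"],
    expected=["Fire door", "Fire alarm"],
)
print(p, r, round(f1, 3))
```

In this example recall is perfect while precision is only one half, which shows how a classifier can gain recall by proposing more labels without a matching gain in precision or F1-score.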
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot learning for multi-label classification
Hierarchical sentence-based classifier evaluation
Industry collaboration for ground truth creation
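The difference between a flat and a hierarchical strategy can be illustrated with a toy top-down traversal. This is only a sketch under assumptions, not the study's classifier: the taxonomy, the requirement, the word-coverage score, and the threshold are all hypothetical. It shows one way a hierarchical strategy can lose recall: pruning a parent class hides every leaf beneath it.

```python
import re


def tokens(text):
    """Set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def coverage(requirement, label):
    """Fraction of the label's words that also occur in the requirement."""
    label_tokens = tokens(label)
    if not label_tokens:
        return 0.0
    return len(label_tokens & tokens(requirement)) / len(label_tokens)


def classify_top_down(requirement, node, threshold=0.6):
    """Descend the taxonomy, visiting only children that match the requirement.

    `node` maps a class name to a dict of its child classes (empty for a leaf).
    Returns the matching leaf classes.
    """
    matches = []
    for name, children in node.items():
        if coverage(requirement, name) < threshold:
            continue  # prune this class and all of its descendants
        if children:
            matches.extend(classify_top_down(requirement, children, threshold))
        else:
            matches.append(name)
    return matches


def leaf_names(node):
    """Yield every leaf class in the taxonomy, ignoring the hierarchy."""
    for name, children in node.items():
        if children:
            yield from leaf_names(children)
        else:
            yield name


# Hypothetical two-level taxonomy and requirement.
taxonomy = {
    "door": {"fire door": {}, "sliding door": {}},
    "building services": {"fire alarm": {}, "ventilation duct": {}},
}
requirement = "The fire door shall close when the fire alarm sounds."

hierarchical = classify_top_down(requirement, taxonomy)
flat = [n for n in leaf_names(taxonomy) if coverage(requirement, n) >= 0.6]
print(hierarchical)
print(flat)
```

Here the flat pass finds both "fire door" and "fire alarm", while the top-down pass misses "fire alarm" because its parent "building services" shares no words with the requirement. The trade-off is that the top-down pass compares against fewer classes, which matters more as a taxonomy grows.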
Waleed Abdeen
PhD Candidate @ SERL, BTH
Requirements Traceability, Multi-Label Classification, AI-for-SE, Software Engineering
M. Unterkalmsteiner
Blekinge Institute of Technology, Karlskrona, Sweden
Krzysztof Wnuk
Software Engineering, Blekinge Institute of Technology (BTH), SERL-Sweden
Software Engineering, Software Business, Open Innovation, Product Management, Requirements Engineering
Alexandros Chirtoglou
HOCHTIEF ViCon GmbH, Essen, Germany
Christoph Schimanski
HOCHTIEF ViCon GmbH, Essen, Germany
Heja Goli
HOCHTIEF ViCon GmbH, Essen, Germany