🤖 AI Summary
Existing automated issue assignment methods are limited by their reliance on large volumes of project-specific data or on relational information that is often sparse. This work proposes the first application of a supervised fine-tuned large language model (DeepSeek-R1-Distill-Llama-8B) to this task. Leveraging only issue titles and descriptions, the model directly generates ranked developer recommendations through end-to-end semantic understanding, without requiring any structured relational data. The proposed approach substantially improves assignment accuracy, achieving up to a 187.8% relative improvement in Hit@1 over the pretrained base model and outperforming four state-of-the-art approaches by as much as 211.2%. These results demonstrate the strong transfer capability of large language models in software engineering tasks.
📝 Abstract
Issue assignment is a critical process in software maintenance, where new issue reports are validated and assigned to suitable developers. However, manual issue assignment is often inconsistent and error-prone, especially in large open-source projects where thousands of new issues are reported monthly. Existing automated approaches have shown promise, but many rely heavily on large volumes of project-specific training data or relational information that is often sparse and noisy, which limits their effectiveness. To address these challenges, we propose LIA (LLM-based Issue Assignment), which employs supervised fine-tuning to adapt an LLM, DeepSeek-R1-Distill-Llama-8B in this work, for automatic issue assignment. By leveraging the LLM's pretrained semantic understanding of natural language and software-related text, LIA learns to generate ranked developer recommendations directly from issue titles and descriptions. The ranking is based on the model's learned understanding of historical issue-to-developer assignments, using patterns from past tasks to infer which developers are most likely to handle new issues. Through comprehensive evaluation, we show that LIA delivers substantial improvements over both its base pretrained model and state-of-the-art baselines. It achieves up to +187.8% higher Hit@1 compared to the DeepSeek-R1-Distill-Llama-8B pretrained base model, and outperforms four leading issue assignment methods by as much as +211.2% in Hit@1 score. These results highlight the effectiveness of domain-adapted LLMs for software maintenance tasks and establish LIA as a practical, high-performing solution for issue assignment.
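To make the setup concrete, the sketch below illustrates the two pieces the abstract describes: formatting an issue's title and description with its historically assigned developers into a prompt/completion pair for supervised fine-tuning, and scoring a ranked recommendation list with Hit@k. This is a minimal illustration, not code from LIA; the prompt wording, field names, and example issue are assumptions for demonstration.

```python
# Minimal sketch (not the paper's implementation): building one SFT example
# from an issue report, and computing Hit@k over a ranked developer list.

def build_sft_example(title, description, ranked_developers):
    """Format an issue and its assignees as a prompt/completion pair."""
    prompt = (
        "Recommend developers for this issue.\n"
        f"Title: {title}\n"
        f"Description: {description}\n"
    )
    # The target is a ranked list, one developer per numbered line.
    completion = "\n".join(
        f"{i + 1}. {dev}" for i, dev in enumerate(ranked_developers)
    )
    return {"prompt": prompt, "completion": completion}


def hit_at_k(ranked_predictions, true_developer, k):
    """Hit@k: 1 if the true assignee appears in the top-k recommendations."""
    return int(true_developer in ranked_predictions[:k])


# Hypothetical issue and developer names, for illustration only.
example = build_sft_example(
    "NullPointerException on startup",
    "App crashes when the config file is missing.",
    ["alice", "bob"],
)
print(hit_at_k(["alice", "bob", "carol"], "bob", 1))  # 0: not ranked first
print(hit_at_k(["alice", "bob", "carol"], "bob", 2))  # 1: within top-2
```

Hit@1 credits only an exact top-ranked match, which is why it is the strictest of the reported metrics and the one on which LIA's relative gains are largest.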