🤖 AI Summary
A systematic investigation of design challenges for large language model (LLM)-driven multi-agent systems (MAS) in software engineering (SE) remains absent. This paper presents the first empirical, SE-focused analysis of LLM-MAS design, grounded in a qualitative synthesis of 94 scholarly works. We identify 12 quality attributes, with functional suitability receiving the most attention, and 16 design patterns, with Role-Based Cooperation the most prevalent; improving the quality of generated code emerges as the most common design rationale. We propose a “Role–Task–Capability” collaborative modeling framework that leverages LLMs’ reasoning and planning capabilities to construct reusable, purpose-built MAS architectures for SE tasks. Our findings provide both theoretical foundations and actionable engineering guidelines for designing LLM-MAS in SE contexts, with the aim of improving generated code quality and system functionality.
📝 Abstract
As the complexity of Software Engineering (SE) tasks continues to escalate, Multi-Agent Systems (MASs) have emerged as a focal point of research and practice due to their autonomy and scalability. By leveraging the reasoning and planning capabilities of Large Language Models (LLMs), LLM-based MASs are attracting increasing attention in the field of SE. However, no dedicated study systematically explores the design of LLM-based MASs, including the Quality Attributes (QAs) on which designers mainly focus, the design patterns they use, and the rationale guiding the design of LLM-based MASs for SE tasks. To this end, we conducted a study to identify the QAs that LLM-based MASs for SE tasks focus on, the design patterns used in these MASs, and the rationale behind their design. We collected 94 papers on LLM-based MASs for SE tasks as the data source. Our study shows that: (1) Code Generation is the most common of the ten identified SE tasks addressed by LLM-based MASs, (2) Functional Suitability is the QA to which designers of LLM-based MASs pay the most attention, (3) Role-Based Cooperation is the most frequently employed of the 16 design patterns used to construct LLM-based MASs, and (4) Improving the Quality of Generated Code is the most common rationale behind the design of LLM-based MASs. Based on the study results, we present implications for the design of LLM-based MASs to support SE tasks.