Evaluating Classical Software Process Models as Coordination Mechanisms for LLM-Based Software Generation

📅 2025-09-17
🤖 AI Summary
Existing research has not systematically investigated how traditional software process models govern collaboration among LLM-based multi-agent systems (MAS) for automated software development. Method: This study pioneers the use of the waterfall, V-model, and agile paradigms as coordination frameworks for LLM-driven MAS. Three process-specific MAS architectures were implemented using GPT-series models and evaluated under uniform experimental conditions using standardized metrics for code quality, generation overhead, and productivity. Contribution/Results: The empirical results reveal critical trade-offs: the waterfall model achieves the highest efficiency but the lowest adaptability; the V-model incurs redundant code generation; and the agile model delivers superior correctness and maintainability at substantially higher computational cost. This work establishes the first systematic empirical foundation and coordination design paradigm for generative-AI-enabled software engineering, elucidating how process models mediate performance, scalability, and maintainability in LLM-MAS automation.

📝 Abstract
[Background] Large Language Model (LLM)-based multi-agent systems (MAS) are transforming software development by enabling autonomous collaboration. Classical software processes such as Waterfall, V-Model, and Agile offer structured coordination patterns that can be repurposed to guide these agent interactions. [Aims] This study explores how traditional software development processes can be adapted as coordination scaffolds for LLM-based MAS and examines their impact on code quality, cost, and productivity. [Method] We executed 11 diverse software projects under three process models and four GPT variants, totaling 132 runs. Each output was evaluated using standardized metrics for size (files, LOC), cost (execution time, token usage), and quality (code smells, AI- and human-detected bugs). [Results] Both process model and LLM choice significantly affected system performance. Waterfall was the most efficient, V-Model produced the most verbose code, and Agile achieved the highest code quality, albeit at higher computational cost. [Conclusions] Classical software processes can be effectively instantiated in LLM-based MAS, but each entails trade-offs across quality, cost, and adaptability. Process selection should reflect project goals, whether prioritizing efficiency, robustness, or structured validation.
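The run count in the Method follows from a full factorial design: 11 projects x 3 process models x 4 GPT variants = 132 runs. A minimal sketch of enumerating such a grid is shown here. The project and model identifiers are placeholders, since the abstract gives only the counts, and one run per combination is an assumption inferred from the total.

```python
from itertools import product

# Process models named in the abstract.
PROCESS_MODELS = ["Waterfall", "V-Model", "Agile"]

# Placeholder identifiers (assumption): the abstract states only the
# counts (11 projects, 4 GPT variants), not their names.
PROJECTS = [f"project_{i:02d}" for i in range(1, 12)]
GPT_VARIANTS = [f"gpt_variant_{i}" for i in range(1, 5)]


def build_run_grid():
    """Return every (project, process model, GPT variant) combination."""
    return list(product(PROJECTS, PROCESS_MODELS, GPT_VARIANTS))


runs = build_run_grid()
print(len(runs))  # 11 * 3 * 4 = 132
```

Each tuple in `runs` would correspond to one generated system to be scored on the size, cost, and quality metrics listed above.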
Problem

Research questions and friction points this paper is trying to address.

Adapting classical software processes for LLM-based multi-agent coordination
Evaluating impact on code quality, cost, and productivity metrics
Comparing Waterfall, V-Model, and Agile performance trade-offs
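One way to picture a process model acting as a coordination mechanism is as a fixed ordering of stages that a shared artifact passes through, with one agent call per stage. The sketch below is an illustration only. The stage names follow textbook descriptions of Waterfall, V-Model, and Agile; `schedule`, `run_process`, `stub_agent`, and the two-sprint default are hypothetical and do not describe the paper's implementation. The LLM call is replaced by a stub.

```python
from typing import Callable, List

# Textbook phase names (assumption), not the paper's agent roles.
WATERFALL = ["requirements", "design", "implementation", "testing"]

# V-Model: each specification/design phase on the way down is paired with
# a test phase on the way up, in reverse order.
V_MODEL_DOWN = ["requirements", "system_design", "module_design", "implementation"]
V_MODEL_UP = ["unit_test", "integration_test", "acceptance_test"]

# Agile: a short cycle repeated once per sprint.
AGILE_SPRINT = ["plan", "implement", "test", "review"]


def schedule(process: str, sprints: int = 2) -> List[str]:
    """Return the ordered list of stages for a given process model."""
    if process == "waterfall":
        return list(WATERFALL)
    if process == "v_model":
        return V_MODEL_DOWN + V_MODEL_UP
    if process == "agile":
        return [f"sprint{n}:{stage}"
                for n in range(1, sprints + 1)
                for stage in AGILE_SPRINT]
    raise ValueError(f"unknown process model: {process}")


def stub_agent(stage: str, artifact: str) -> str:
    """Stand-in for an LLM agent: appends the stage name to the artifact."""
    return f"{artifact} -> {stage}"


def run_process(process: str, task: str,
                agent: Callable[[str, str], str] = stub_agent) -> str:
    """Pass the artifact through every scheduled stage in order."""
    artifact = task
    for stage in schedule(process):
        artifact = agent(stage, artifact)
    return artifact


print(run_process("waterfall", "todo-app"))
```

In this sketch the number of agent invocations is simply the length of the schedule, so the Agile variant grows with the sprint count while the other two stay fixed.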
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapting classical software process models
Evaluating coordination mechanisms for LLM agents
Comparing Waterfall, V-Model, and Agile performance
Duc Minh Ha
Faculty of Computer Science and Engineering, Ho Chi Minh City University of Technology (HCMUT), Vietnam
Phu Trac Kien
FPT University, Vietnam
Tho Quan
Unknown affiliation
Anh Nguyen-Duc
Professor, University of South Eastern Norway, Norwegian University of Science and
Software Engineering, Generative AI