Automatically Generating Web Applications from Requirements Via Multi-Agent Test-Driven Development

📅 2025-09-29
🤖 AI Summary
Existing methods for generating web applications produce only frontend code, which compromises functional completeness and end-to-end reliability. This paper introduces TDDev, presented as the first multi-agent large language model framework to integrate test-driven development (TDD) for fully automated, end-to-end full-stack application generation from a natural language description or a design image. TDDev generates executable applications comprising frontend, backend, database schemas, and interactive logic. Its core innovation is a closed-loop pipeline: multimodal perception → test case derivation → collaborative code generation → simulated user interaction. It automatically derives test cases covering both functionality and UI, then iteratively refines interdependent multi-file code until the requirements are satisfied. Experiments show that TDDev achieves a 14.4% absolute improvement in overall accuracy over state-of-the-art methods across multiple benchmarks.

📝 Abstract
Developing full-stack web applications is complex and time-intensive, demanding proficiency across diverse technologies and frameworks. Although recent advances in multimodal large language models (MLLMs) enable automated webpage generation from visual inputs, current solutions remain limited to front-end tasks and fail to deliver fully functional applications. In this work, we introduce TDDev, the first test-driven development (TDD)-enabled LLM-agent framework for end-to-end full-stack web application generation. Given a natural language description or design image, TDDev automatically derives executable test cases, generates front-end and back-end code, simulates user interactions, and iteratively refines the implementation until all requirements are satisfied. Our framework addresses key challenges in full-stack automation, including underspecified user requirements, complex interdependencies among multiple files, and the need for both functional correctness and visual fidelity. Through extensive experiments on diverse application scenarios, TDDev achieves a 14.4% improvement in overall accuracy compared to state-of-the-art baselines, demonstrating its effectiveness in producing reliable, high-quality web applications without requiring manual intervention.
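
The loop the abstract describes (derive tests, generate code, simulate interactions, refine until every requirement passes) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not TDDev's implementation: every name here (`DerivedTest`, `tdd_loop`, `toy_derive_tests`, `toy_generate`) is hypothetical, the two callables stand in for LLM agents, and each test's `check` function stands in for a simulated user interaction against the running application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; none of these names come from the paper.
Files = Dict[str, str]  # file path -> source text


@dataclass
class DerivedTest:
    name: str
    check: Callable[[Files], bool]  # stand-in for a simulated user interaction


def tdd_loop(
    requirement: str,
    derive_tests: Callable[[str], List[DerivedTest]],
    generate: Callable[[str, Files, List[str]], Files],
    max_rounds: int = 5,
) -> Tuple[Files, List[str]]:
    """Derive tests once, then regenerate code until they pass or rounds run out."""
    tests = derive_tests(requirement)
    files: Files = {}
    failures: List[str] = []  # no feedback exists before the first generation
    for _ in range(max_rounds):
        # The generator sees the current files plus the names of failing tests.
        files = generate(requirement, files, failures)
        failures = [t.name for t in tests if not t.check(files)]
        if not failures:
            break
    return files, failures


# Toy stand-ins for the test-derivation and code-generation agents.
def toy_derive_tests(requirement: str) -> List[DerivedTest]:
    return [
        DerivedTest("login form is rendered",
                    lambda f: "<form" in f.get("index.html", "")),
        DerivedTest("login route exists",
                    lambda f: "/login" in f.get("server.py", "")),
    ]


def toy_generate(requirement: str, files: Files, failures: List[str]) -> Files:
    updated = dict(files)
    if "index.html" not in updated:
        # First pass: frontend only, mirroring the frontend-only limitation.
        updated["index.html"] = "<form action='/login'></form>"
    elif "login route exists" in failures:
        # Later pass: the failing test drives the backend fix.
        updated["server.py"] = "ROUTES = ['/login']"
    return updated
```

Calling `tdd_loop("a login page", toy_derive_tests, toy_generate)` runs two rounds with these stubs: the first produces only the frontend file, and the failing route test then prompts the second round to add the backend file. Passing failing test names back to the generator is one simple design choice for the feedback channel; a real system would likely pass richer traces such as screenshots or error logs.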

Problem

Research questions and friction points this paper is trying to address.

Automating full-stack web application development from requirements
Addressing underspecified user needs and complex file dependencies
Ensuring functional correctness and visual fidelity in applications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework automates full-stack web development
Generates code from requirements using test-driven development
Iteratively refines implementation via simulated user interactions