🤖 AI Summary
To address low testing efficiency, insufficient defect detection rates, and limited coverage in large-scale software development, this paper proposes a test-oriented AI Copilot framework. Methodologically, it introduces a dynamic, context-aware retrieval-augmented generation (RAG) mechanism tailored to test scenarios, combining LLM fine-tuning, code semantic graph construction, and incremental test-knowledge retrieval so that test case generation and defect detection run in real time, synchronized with code evolution. Crucially, it models coding and defect detection as symbiotic tasks sharing a common "low-defect" objective, which the authors present as the first such formulation. Experimental results demonstrate a 31.2% improvement in bug detection accuracy, a 12.6% increase in critical-path test coverage, and a 10.5% rise in user acceptance rate, validating the framework's effectiveness in enhancing both automated testing quality and developer productivity.
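The retrieval step of such a test-oriented RAG pipeline can be illustrated with a minimal sketch: rank code chunks by similarity to a test-generation query, then assemble the top matches into an LLM prompt. This is not the paper's implementation; the bag-of-words cosine scoring and the function names (`retrieve_context`, `build_test_prompt`) are illustrative assumptions, and the actual LLM call is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer: lowercase alphanumeric/underscore runs.
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    # Cosine similarity over term-frequency vectors of two token lists.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_context(query, code_chunks, k=2):
    # Rank code chunks by similarity to the test-generation query.
    q = tokenize(query)
    ranked = sorted(code_chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)
    return ranked[:k]


def build_test_prompt(query, code_chunks, k=2):
    # Assemble an LLM prompt from the retrieved context (model call not shown).
    context = "\n\n".join(retrieve_context(query, code_chunks, k))
    return f"Generate unit tests for the following code:\n{context}\n\nTask: {query}"
```

A production system would replace the bag-of-words scorer with learned embeddings and draw candidate chunks from the code semantic graph rather than a flat list.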
📝 Abstract
The rapid pace of large-scale software development places increasing demands on traditional testing methodologies, often creating bottlenecks in efficiency, accuracy, and coverage. We propose a novel perspective on software testing, positing bug detection and coding with fewer bugs as two interconnected problems that share a common goal: reducing bugs with limited resources. We extend our previous work on AI-assisted programming, which supports code auto-completion and chatbot-powered Q&A, to the realm of software testing. We introduce Copilot for Testing, an automated testing system that synchronizes bug detection with codebase updates, leveraging context-based Retrieval-Augmented Generation (RAG) to enhance the capabilities of large language models (LLMs). Our evaluation demonstrates a 31.2% improvement in bug detection accuracy, a 12.6% increase in critical test coverage, and a 10.5% higher user acceptance rate, highlighting the transformative potential of AI-driven technologies in modern software development practices.
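The abstract's key mechanism, keeping bug detection synchronized with codebase updates, amounts to maintaining a retrieval index incrementally: when a file changes, only that file is re-indexed, so subsequent queries see fresh context. A minimal sketch of that idea follows; the class name `IncrementalIndex`, its methods, and the token-overlap scoring are illustrative assumptions, not the paper's design.

```python
import re


def _tokens(text):
    # Lowercased alphanumeric/underscore tokens as a set.
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


class IncrementalIndex:
    """Keep a retrieval index in step with codebase updates, file by file."""

    def __init__(self):
        self._tokens_by_path = {}  # path -> token set

    def upsert(self, path, source):
        # Called whenever a file is added or edited; re-index only that file.
        self._tokens_by_path[path] = _tokens(source)

    def remove(self, path):
        # Called when a file is deleted from the codebase.
        self._tokens_by_path.pop(path, None)

    def search(self, query, k=1):
        # Rank files by token overlap with the query; return the top-k paths.
        q = _tokens(query)
        ranked = sorted(
            self._tokens_by_path.items(),
            key=lambda item: len(q & item[1]),
            reverse=True,
        )
        return [path for path, _ in ranked[:k]]
```

In a real deployment the `upsert` hook would be driven by version-control or editor events, and the overlap score replaced by semantic retrieval; the point of the sketch is that only the changed file is re-processed, not the whole codebase.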