Improving Retrieval-Augmented Deep Assertion Generation via Joint Training

📅 2025-02-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing retrieve-and-edit assertion generation methods treat retrieval and generation as separate components, so neither is optimized to help the other. This paper proposes AG-RAG, a retrieval-augmented assertion generation approach that trains the retriever and the generator jointly. Built on CodeT5 and inspired by the plastic surgery hypothesis, AG-RAG pairs a dense retriever, which finds relevant test-assert pairs (TAPs) by semantic matching, with a retrieval-augmented generator that takes the focal-test and the retrieved TAPs as input. The retriever is optimized together with the generator as a single pipeline. Evaluated on two benchmarks and three metrics, AG-RAG outperforms six state-of-the-art (SOTA) approaches on all of them. Against the most recent baseline, EditAS, it improves accuracy by 20.82% and 26.98% on the two benchmarks. AG-RAG also correctly generates 1,739 and 2,866 unique assertions that all baselines fail to generate, 3.45× and 9.20× more than EditAS.

📝 Abstract
Unit testing attempts to validate the correctness of basic units of the software system under test and has a crucial role in software development and testing. Very recent work proposes a retrieve-and-edit approach to generate unit test oracles, i.e., assertions. Despite being promising, it is still far from perfect due to some limitations, such as splitting assertion retrieval and generation into two separate components without benefiting each other. In this paper, we propose AG-RAG, a retrieval-augmented automated assertion generation approach that leverages external codebases and joint training to address various technical limitations of prior work. Inspired by the plastic surgery hypothesis, AG-RAG attempts to combine relevant unit tests and advanced pre-trained language models (PLMs) with retrieval-augmented fine-tuning. AG-RAG builds a dense retriever to search for relevant test-assert pairs (TAPs) with semantic matching and a retrieval-augmented generator to synthesize accurate assertions with the focal-test and retrieved TAPs as input. Besides, AG-RAG leverages a code-aware language model CodeT5 as the cornerstone to facilitate both assertion retrieval and generation tasks. Furthermore, the retriever is optimized in conjunction with the generator as a whole pipeline with a joint training strategy. This unified design fully adapts both components specifically for retrieving more useful TAPs, thereby generating accurate assertions. We extensively evaluate AG-RAG against six state-of-the-art AG approaches on two benchmarks and three metrics. Experimental results show that AG-RAG significantly outperforms previous AG approaches on all benchmarks and metrics, e.g., improving the most recent baseline EditAS by 20.82% and 26.98% in terms of accuracy. AG-RAG also correctly generates 1739 and 2866 unique assertions that all baselines fail to generate, 3.45X and 9.20X more than EditAS.
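The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the bag-of-tokens `embed` function is a toy stand-in for a learned encoder such as CodeT5, and the names `retrieve`, `build_generator_input`, and the `[SEP]` separator are hypothetical rather than taken from AG-RAG.

```python
import math
import re
from collections import Counter


def embed(code: str) -> Counter:
    """Toy stand-in for a learned encoder: a bag-of-tokens count vector."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+", code))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(focal_test: str, corpus: list[tuple[str, str]], k: int = 1):
    """Return the k (test, assertion) pairs whose test best matches focal_test."""
    query = embed(focal_test)
    ranked = sorted(corpus, key=lambda tap: cosine(query, embed(tap[0])), reverse=True)
    return ranked[:k]


def build_generator_input(focal_test: str, taps) -> str:
    """Concatenate the focal-test with the retrieved TAPs (hypothetical format)."""
    parts = [focal_test]
    for test, assertion in taps:
        parts.extend([test, assertion])
    return " [SEP] ".join(parts)


# Hypothetical two-entry codebase of test-assert pairs.
corpus = [
    ("void testAdd() { int r = calc.add(2, 3); }", "assertEquals(5, r)"),
    ("void testIsEmpty() { List<String> xs = new ArrayList<>(); }", "assertTrue(xs.isEmpty())"),
]
focal = "void testAddNegative() { int r = calc.add(-2, 3); }"

top = retrieve(focal, corpus, k=1)
prompt = build_generator_input(focal, top)
print(prompt)
```

In a real system the count vectors would be replaced by dense embeddings from the encoder, and `prompt` would be fed to the generator, which produces the assertion for the focal-test.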
Problem

Research questions and friction points this paper is trying to address.

Improving unit test assertion generation
Joint training for retrieval and generation
Leveraging external codebases for accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint training optimizes the retriever together with the generator as one pipeline
Dense retriever finds relevant test-assert pairs (TAPs) through semantic matching
CodeT5 serves as the shared backbone for both assertion retrieval and generation
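The abstract does not spell out the joint training objective, so the following is only a sketch of one common way to couple a retriever and a generator: a RAG-style marginal likelihood, in which the retriever's scores weight the generator's likelihood of the reference assertion under each retrieved TAP. The function names and the numbers are hypothetical, and AG-RAG's actual loss may differ.

```python
import math


def softmax(scores):
    """Turn raw retrieval scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_nll(retrieval_scores, gen_log_likelihoods):
    """Negative log of sum_i p(tap_i | focal_test) * p(assertion | focal_test, tap_i).

    retrieval_scores: retriever similarity score for each candidate TAP.
    gen_log_likelihoods: generator log-likelihood of the reference assertion
        when conditioned on the focal-test plus that TAP.
    """
    p_retrieve = softmax(retrieval_scores)
    terms = [math.log(p) + ll for p, ll in zip(p_retrieve, gen_log_likelihoods)]
    m = max(terms)  # log-sum-exp for numerical stability
    return -(m + math.log(sum(math.exp(t - m) for t in terms)))


# Hypothetical case: the first TAP helps the generator far more than the second.
gen_ll = [-0.1, -3.0]
loss_when_retriever_prefers_helpful = joint_nll([2.0, 0.0], gen_ll)
loss_when_retriever_prefers_unhelpful = joint_nll([0.0, 2.0], gen_ll)
print(loss_when_retriever_prefers_helpful < loss_when_retriever_prefers_unhelpful)
```

Under an objective of this shape, the loss falls when the retriever ranks the TAP that actually helps generation higher, so gradients from the generation loss also train the retriever. That is the kind of mutual benefit the summary attributes to joint training.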
Authors

Quanjun Zhang
Nanjing University of Science and Technology
Software Engineering, Software Testing, Automated Program Repair
Chunrong Fang
Software Institute, Nanjing University
Software Testing, Software Engineering, Computer Science
Yi Zheng
State Key Laboratory for Novel Software Technology, Nanjing University, China
Ruixiang Qian
Nanjing University
Fuzzing, Software Testing, Program Analysis
Shengcheng Yu
Technical University of Munich
Software Engineering, Software Testing, GUI Automation, GUI Testing, Mobile App Testing
Yuan Zhao
Lanzhou University of Technology
Time Series Forecasting
Jianyi Zhou
Peking University
Software Testing
Yun Yang
Department of Computing Technologies, Swinburne University of Technology, Melbourne, VIC 3122, Australia
Tao Zheng
State Key Laboratory for Novel Software Technology, Nanjing University, China
Zhenyu Chen
State Key Laboratory for Novel Software Technology, Nanjing University, China