Static Program Analysis Guided LLM Based Unit Test Generation

📅 2025-03-07
🤖 AI Summary
Current LLM-based approaches to Java unit test generation face two key bottlenecks: (1) prompting with invocation examples fails when no such examples exist, and (2) supplying the full class as context can exceed the LLM's context window. This paper proposes StaticPrompt, an approach that integrates static program analysis with prompt engineering. Its core idea is to extract lightweight, semantically precise context, such as control-flow paths and cross-method data dependencies, directly from bytecode and to inject it into LLM prompts in structured form, removing the need for redundant source-code input. Experiments on large-scale industrial and open-source Java projects show that StaticPrompt significantly improves branch coverage (+21.3%) and test pass rate (+34.7%). It also remains robust on highly complex methods and very long contexts, extending the range of methods that pure-LLM approaches can handle.

📝 Abstract
We describe a novel approach to automating unit test generation for Java methods using large language models (LLMs). Existing LLM-based approaches rely on sample usages of the method under test (the focal method), provide the entire class of the focal method as prompt context, or both. The former is often not viable because sample usages are lacking, especially for newly written focal methods. The latter does not scale well: the more complex the focal method and the larger its associated class, the harder it is to produce adequate test code (for example, because the input exceeds the prompt and context lengths of the underlying LLM). We show that augmenting prompts with concise and precise context information obtained by program analysis increases the effectiveness of generating unit test code through LLMs. We validate our approach on a large commercial Java project and a popular open-source Java project.
Problem

Research questions and friction points this paper is trying to address.

Automating unit test generation for Java methods using LLMs.
Overcoming limitations of existing LLM-based test generation approaches.
Enhancing test generation with concise program analysis-derived context.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses static program analysis to derive prompt context.
Augments LLM prompts with concise, precise context instead of the entire class.
Validated on a large commercial Java project and a popular open-source Java project.
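
The idea of prompting with concise, analysis-derived context rather than the whole class can be illustrated with a minimal sketch. This is not the paper's implementation: the `FocalContext` fields, the `build_prompt` helper, the prompt wording, and the `Account.withdraw` example are all hypothetical, and the sketch assumes a static analysis has already produced the facts shown.

```python
from dataclasses import dataclass, field


@dataclass
class FocalContext:
    """Concise facts about a focal method (hypothetical schema, not the paper's)."""
    class_name: str
    signature: str
    body: str
    field_declarations: list[str] = field(default_factory=list)
    callee_signatures: list[str] = field(default_factory=list)


def build_prompt(ctx: FocalContext) -> str:
    """Assemble a test-generation prompt from the concise context only."""
    sections = [
        f"Write JUnit tests for the method below in class {ctx.class_name}.",
        "Focal method:",
        f"{ctx.signature} {ctx.body}",
    ]
    # Include only the class members the focal method actually touches,
    # instead of pasting the whole class into the prompt.
    if ctx.field_declarations:
        sections.append("Fields it reads or writes:")
        sections.extend(f"- {decl}" for decl in ctx.field_declarations)
    if ctx.callee_signatures:
        sections.append("Methods it calls:")
        sections.extend(f"- {sig}" for sig in ctx.callee_signatures)
    return "\n".join(sections)


# Hypothetical focal method; in practice these facts would come from an analyzer.
ctx = FocalContext(
    class_name="Account",
    signature="public void withdraw(long amount)",
    body="{ if (amount > balance) throw new IllegalStateException(); balance -= amount; }",
    field_declarations=["private long balance"],
)
prompt = build_prompt(ctx)
print(prompt)
```

The prompt grows with the number of members the focal method touches, not with the size of the enclosing class, which is the scaling property the abstract argues for.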
Authors

Sujoy Roychowdhury
Ericsson R&D, Bangalore, India
G. Sridhara
Ericsson R&D, Bangalore, India
A. K. Raghavan
Independent Researcher, Chennai, India
Joy Bose
Senior Data Scientist at Ericsson, previously in Samsung, Microsoft, Embibe
Machine learning, spiking neural networks, EEG/BCI, large language models, deep learning
Sourav Mazumdar
Ericsson R&D, Bangalore, India
Hamender Singh
Ericsson R&D, Bangalore, India
Srinivasan Bajji Sugumaran
Ericsson R&D, Bangalore, India
Ricardo Britto
Ericsson / Blekinge Institute of Technology
Software process improvement, Machine Learning, Search-based software engineering