TICoder: A Repository-Level Code Generation Framework with Test-Driven Planning and Implementation-Aware Reuse

📅 2026-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses two weaknesses of repository-level code generation with large language models: inadequate planning and ineffective function reuse, both made harder by complex dependencies and limited context windows. The authors propose TICoder, a framework that pairs a test-driven iterative planning mechanism with an implementation-aware reuse strategy. Candidate callee functions are retrieved with a dual-view similarity that covers both what a function does and how it is implemented, and relevant usage patterns are then chosen by a two-stage selection step that combines structure-based clustering with perplexity-based filtering. In experiments on widely used repository-level code generation benchmarks with various LLMs, TICoder outperforms state-of-the-art methods by an average of 11.52%.
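The test-driven iterative planning idea summarized above can be pictured as a loop that checks a draft plan against test cases and revises it until the expected behaviors are covered. The sketch below is only an illustration under stated assumptions: `iterative_plan`, the `refine` callback, and the keyword-based checks are hypothetical stand-ins, and in TICoder the refinement would be performed by an LLM rather than by a hand-written function.

```python
from typing import Callable, List, Tuple

Plan = List[str]
# A test case here is a (description, check) pair; the check inspects the plan.
# This representation is an assumption made for illustration only.
TestCase = Tuple[str, Callable[[Plan], bool]]


def iterative_plan(
    draft: Plan,
    tests: List[TestCase],
    refine: Callable[[Plan, List[str]], Plan],
    max_rounds: int = 3,
) -> Plan:
    """Refine a plan until every test-derived check passes or rounds run out."""
    plan = list(draft)
    for _ in range(max_rounds):
        # Collect the behaviors that the current plan does not yet cover.
        failed = [name for name, check in tests if not check(plan)]
        if not failed:
            break
        # Hypothetical refinement step (an LLM call in a real system).
        plan = refine(plan, failed)
    return plan


def toy_refine(plan: Plan, failed: List[str]) -> Plan:
    """Toy stand-in for an LLM: add one step per uncovered behavior."""
    return plan + [f"handle case: {name}" for name in failed]


tests: List[TestCase] = [
    ("empty input", lambda p: any("empty input" in step for step in p)),
    ("missing file", lambda p: any("missing file" in step for step in p)),
]
final_plan = iterative_plan(["parse the config file"], tests, toy_refine)
```

After one refinement round, `final_plan` keeps the original step and gains one step per behavior that the draft did not cover.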

📝 Abstract

Repository-level code generation with Large Language Models (LLMs) remains challenging, primarily due to complex dependencies and limited context windows. Recent approaches adopt retrieval-augmented generation (RAG) and planning mechanisms to reuse potential callee functions in the repository. However, these approaches often suffer from two limitations: they lack test-driven behavioral guidance during planning, and they overlook the implementation logic embedded in repository code during reuse. As a result, generated plans may not align with expected behaviors, and retrieved functions may not be effectively reused. In this paper, we propose TICoder, a novel repository-level code generation framework that improves both planning and reuse. TICoder introduces a test-driven iterative planning mechanism that leverages test cases as behavioral specifications to refine implementation steps. Furthermore, TICoder employs an implementation-aware code reuse strategy, which retrieves potential callee functions using a dual-view similarity that captures both functional and implementation aspects. It then identifies relevant usage patterns through a dual-stage selection strategy that combines structure-based clustering and perplexity-based filtering. We conduct extensive experiments on widely used repository-level code generation benchmarks with various LLMs. Experimental results demonstrate that TICoder outperforms state-of-the-art (SOTA) methods, achieving an average improvement of 11.52%.
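To make the reuse strategy in the abstract more concrete, the following sketch shows one way a dual-view similarity and a dual-stage selection could fit together. Everything specific here is an assumption, since the abstract does not give the actual formulas: token-level Jaccard overlap stands in for both similarity views, the sequence of AST node types stands in for the structural clustering feature, and a smoothed unigram model stands in for the LLM that would normally supply perplexity. The names `dual_view_score`, `structure_key`, and `select_usages` are hypothetical.

```python
import ast
import math
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercased identifier-like tokens, as a set."""
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))


def jaccard(a, b):
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def dual_view_score(query_desc, query_draft, cand, alpha=0.5):
    """Blend a functional view (descriptions) with an implementation view (code)."""
    functional = jaccard(query_desc, cand["doc"])
    implementation = jaccard(query_draft, cand["body"])
    return alpha * functional + (1 - alpha) * implementation


def structure_key(snippet):
    """Stage 1 feature (assumed): the sequence of AST node types in the snippet."""
    return tuple(type(node).__name__ for node in ast.walk(ast.parse(snippet)))


def perplexity(snippet, counts, total):
    """Stage 2 score (assumed): perplexity under an add-one-smoothed unigram model."""
    toks = re.findall(r"\w+", snippet)
    vocab = len(counts) + 1
    log_prob = sum(math.log((counts[t] + 1) / (total + vocab)) for t in toks)
    return math.exp(-log_prob / max(len(toks), 1))


def select_usages(snippets, context):
    """Cluster usages by structure, then keep the lowest-perplexity one per cluster."""
    counts = Counter(re.findall(r"\w+", context))
    total = sum(counts.values())
    clusters = defaultdict(list)
    for snippet in snippets:
        clusters[structure_key(snippet)].append(snippet)
    return [
        min(group, key=lambda s: perplexity(s, counts, total))
        for group in clusters.values()
    ]


# Toy data, invented for illustration.
desc = "read yaml configuration file"
draft = "with open(path) as f: return yaml.safe_load(f)"
cand_a = {"doc": "read a yaml configuration file from disk",
          "body": "with open(path) as fh: return yaml.safe_load(fh)"}
cand_b = {"doc": "send http request", "body": "return requests.get(url)"}
score_a = dual_view_score(desc, draft, cand_a)
score_b = dual_view_score(desc, draft, cand_b)

usages = [
    "cfg = load_config(path)",
    "settings = load_config(p)",
    "for p in paths:\n    load_config(p)",
]
selected = select_usages(usages, context="path = get_path()\ncfg = None")
```

In this toy run, the first two usages share a structure and fall into one cluster, so only the one whose tokens best match the surrounding context survives; the loop-based usage forms its own cluster and is kept.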

Problem

Research questions and friction points this paper is trying to address.

repository-level code generation

test-driven planning

implementation-aware reuse

code generation with LLMs

callee function reuse

Innovation

Methods, ideas, or system contributions that make the work stand out.

test-driven planning

implementation-aware reuse

repository-level code generation