LLMs Meet Library Evolution: Evaluating Deprecated API Usage in LLM-based Code Completion

📅 2024-06-14
📈 Citations: 3
Influential: 0
📄 PDF
🤖 AI Summary
This paper presents the first systematic empirical study of deprecated API misuse by large language models (LLMs) in code completion. Leveraging seven state-of-the-art LLMs, 145 deprecated-to-current API mappings across eight popular Python libraries, and over 28,000 code completion samples, the study establishes a multidimensional evaluation framework spanning models, prompts, and libraries, and integrates deprecation annotation with root-cause attribution analysis. It finds an average deprecated API invocation rate of 12.7%, primarily attributable to model version lag, documentation bias, and insufficient contextual awareness. To mitigate this, the authors propose two lightweight interventions, REPLACEAPI (targeted API substitution) and INSERTPROMPT (context-aware prompt augmentation), which reduce deprecated API usage by up to 39.2% and establish a baseline for the co-evolution of LLMs and software libraries.
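To make the REPLACEAPI idea concrete, the sketch below shows a minimal post-processing step that rewrites deprecated API names in a completed snippet using a deprecated-to-current mapping. This is an illustration only, not the paper's implementation: the mapping entries are examples (the paper covers 145 mappings across eight libraries), and naive string substitution ignores signature differences that a real fix would have to handle.

```python
# Hypothetical deprecated-to-current name mappings (examples only;
# np.float was a deprecated NumPy alias for the builtin float).
API_MAPPINGS = {
    "np.float": "float",
    "scipy.misc.imread": "imageio.imread",
}

def replace_deprecated(completion: str, mappings: dict) -> str:
    """Substitute deprecated API names in an LLM-completed snippet.

    Note: plain string replacement is for illustration; it does not
    adapt call signatures or guard against partial-name collisions.
    """
    for deprecated, current in mappings.items():
        completion = completion.replace(deprecated, current)
    return completion

print(replace_deprecated("x = np.float(3)", API_MAPPINGS))
```

A production version would operate on the parsed AST rather than raw text, so that only genuine call sites are rewritten.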

📝 Abstract
Large language models (LLMs), pre-trained or fine-tuned on large code corpora, have shown effectiveness in generating code completions. However, in LLM-based code completion, LLMs may struggle to use correct and up-to-date Application Programming Interfaces (APIs) due to the rapid and continuous evolution of libraries. While existing studies have highlighted issues with predicting incorrect APIs, the specific problem of deprecated API usage in LLM-based code completion has not been thoroughly investigated. To address this gap, we conducted the first evaluation study on deprecated API usage in LLM-based code completion. This study involved seven advanced LLMs, 145 API mappings from eight popular Python libraries, and 28,125 completion prompts. The study results reveal the status quo (i.e., API usage plausibility and deprecated usage rate) of deprecated API and replacing API usage in LLM-based code completion from the perspectives of model, prompt, and library, and indicate the root causes behind them. Based on these findings, we propose two lightweight fixing approaches, REPLACEAPI and INSERTPROMPT, which can serve as baseline approaches for future research on mitigating deprecated API usage in LLM-based completion. Additionally, we provide implications for future research on integrating library evolution with LLM-driven software development.
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with deprecated API usage.
Evaluate deprecated API usage in code completion.
Propose fixes for LLM-based API usage errors.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates deprecated API usage
Proposes REPLACEAPI and INSERTPROMPT
Integrates library evolution with LLMs
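The INSERTPROMPT intervention can be sketched as lightweight prompt augmentation: before querying the model, a deprecation hint is inserted into the completion prompt to steer generation toward the current API. The helper and hint wording below are hypothetical illustrations under that assumption, not the paper's exact prompt template.

```python
def insert_deprecation_hint(prompt: str, deprecated: str, replacement: str) -> str:
    """Prepend a deprecation note to a code-completion prompt.

    The hint format is a hypothetical example of context-aware prompt
    augmentation in the spirit of INSERTPROMPT.
    """
    hint = f"# Note: `{deprecated}` is deprecated; use `{replacement}` instead.\n"
    return hint + prompt

augmented = insert_deprecation_hint(
    "def load_image(path):\n    return ",
    "scipy.misc.imread",
    "imageio.imread",
)
print(augmented)
```

Because the hint lives entirely in the prompt, this approach needs no model retraining, which is what makes it a lightweight baseline.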
Chong Wang
School of Computer Science and Engineering, Nanyang Technological University, Singapore
Kaifeng Huang
Tongji University
OSS Supply Chain · Software Engineering
Jian Zhang
School of Computer Science and Engineering, Nanyang Technological University, Singapore
Yebo Feng
Nanyang Technological University
Computer Security · Network Security · Blockchain Security · Network Traffic Analysis
Lyuye Zhang
Postdoc, Nanyang Technological University
Program Analysis · Open Source · Open Source Security · Software Supply Chain · Software Maintenance
Yang Liu
School of Computer Science and Engineering, Nanyang Technological University, Singapore
Xin Peng
East China University of Science and Technology
Artificial Intelligence · Machine Learning · Complex Process Modeling