🤖 AI Summary
There is currently no systematic taxonomy of code smells for LLM inference code, which hinders high-quality integration of LLMs into software systems.
Method: This paper introduces the novel concept of "LLM code smells," formally defining five representative categories (including prompt hardcoding, unvalidated response handling, and context leakage) and establishing the first structured, inference-phase-specific classification catalog. We extend the SpecDetect4AI toolchain with static analysis and rule-based pattern matching to enable automated detection.
Contribution/Results: Evaluated on 200 open-source LLM applications, our approach detects LLM code smells in 60.50% of the analyzed systems, with an average detection precision of 86.06%. This work bridges a critical gap in the quality assurance of LLM engineering practice, providing both a theoretical foundation and a practical, automated detection capability to support secure, maintainable, and production-ready LLM integration.
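For concreteness, here is a minimal Python sketch (our illustration, not code from the paper) of what two of these smells can look like in practice; the function and its invoice scenario are invented for this example:

```python
import json
from openai import OpenAI

client = OpenAI()

def extract_invoice_total(invoice_text: str) -> float:
    # Prompt hardcoding: the prompt lives as an inline string literal,
    # entangling prompt engineering with business logic and making the
    # prompt hard to version, test, or update independently.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": "Return ONLY a JSON object {\"total\": <number>} "
                       "for this invoice:\n" + invoice_text,
        }],
    )
    # Unvalidated response handling: the completion is parsed and indexed
    # with no schema check or error handling, so any malformed model
    # output raises an unhandled exception at runtime.
    return json.loads(response.choices[0].message.content)["total"]
```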
📄 Abstract
Large Language Models (LLMs) have gained massive popularity in recent years and are increasingly integrated into software systems for diverse purposes. However, integrating them poorly into source code may undermine software system quality. Yet, to our knowledge, there is no formal catalog of code smells specific to coding practices for LLM inference. In this paper, we introduce the concept of LLM code smells and formalize five recurrent problematic coding practices related to LLM inference in software systems, based on relevant literature. We extend the detection tool SpecDetect4AI to cover the newly defined LLM code smells and use it to validate their prevalence in a dataset of 200 open-source LLM systems. Our results show that LLM code smells affect 60.50% of the analyzed systems, with a detection precision of 86.06%.
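To illustrate the detection side, the following is a minimal sketch of a rule-based static check in the spirit of the approach, built on Python's ast module. SpecDetect4AI's actual rules are not shown in the abstract; the class name HardcodedPromptRule and its matching heuristic are assumptions made for illustration:

```python
import ast

class HardcodedPromptRule(ast.NodeVisitor):
    """Flags calls whose `messages` argument carries an inline prompt string."""

    def __init__(self) -> None:
        self.findings: list[int] = []

    def visit_Call(self, node: ast.Call) -> None:
        for kw in node.keywords:
            if kw.arg == "messages":
                # Heuristic: a message dict whose "content" value is a string
                # literal, a concatenation, or an f-string is treated as a
                # hardcoded prompt rather than an externalized template.
                for d in ast.walk(kw.value):
                    if isinstance(d, ast.Dict):
                        for key, value in zip(d.keys, d.values):
                            if (isinstance(key, ast.Constant)
                                    and key.value == "content"
                                    and isinstance(value, (ast.Constant,
                                                           ast.BinOp,
                                                           ast.JoinedStr))):
                                self.findings.append(node.lineno)
        self.generic_visit(node)

# Example usage on a single source file.
with open("app.py") as f:
    tree = ast.parse(f.read())
rule = HardcodedPromptRule()
rule.visit(tree)
print("possible hardcoded prompts at lines:", rule.findings)
```

A real detector would combine several such rules and tune them against false positives, which is where the reported 86.06% precision figure becomes the relevant quality measure.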