🤖 AI Summary
AI-specific code smells—recurring patterns that cause irreproducibility, silent failures, or poor generalization—are prevalent in AI-based systems, yet existing detection tools cover few of them. Method: This paper catalogs 22 AI-specific code smells and proposes a declarative, domain-specific language (DSL)-based rule specification framework for scalable, high-precision static analysis. Built on this framework, the authors develop SpecDetect4AI, an automated detection tool. Contribution/Results: Evaluated on 826 real-world AI projects (20 million lines of code), SpecDetect4AI achieves 88.66% precision and 88.89% recall, outperforming existing detectors. A user study yields a System Usability Scale (SUS) score of 81.7/100. The work establishes a reusable, extensible static-analysis approach for improving the reliability of AI software.
📝 Abstract
The rise of Artificial Intelligence (AI) is reshaping how software systems are developed and maintained. However, AI-based systems give rise to new software issues that existing detection tools often miss. Among these, we focus on AI-specific code smells: recurring patterns in the code that may indicate deeper problems such as unreproducibility, silent failures, or poor model generalization. We introduce SpecDetect4AI, a tool-based approach for the specification and detection of these code smells at scale. This approach combines a high-level declarative Domain-Specific Language (DSL) for rule specification with an extensible static analysis tool that interprets these rules to detect the corresponding smells in AI-based systems. We specified 22 AI-specific code smells and evaluated SpecDetect4AI on 826 AI-based systems (20M lines of code), achieving a precision of 88.66% and a recall of 88.89%, outperforming other existing detection tools. Our results show that SpecDetect4AI supports the specification and detection of AI-specific code smells through dedicated rules and can effectively analyze large AI-based systems, demonstrating both efficiency and usability (SUS 81.7/100).
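To make the notion of an AI-specific code smell concrete, here is a minimal, hypothetical Python illustration of one such pattern mentioned in the abstract (unreproducibility): a data split that relies on an unseeded random number generator, alongside a seeded variant. This example is not taken from the paper's catalog of 22 smells; the function names and the 80/20 split are assumptions for illustration only.

```python
import random

# Smelly (illustrative): the global RNG is never seeded, so every run
# shuffles the data differently and experiment results cannot be reproduced.
def split_data_smelly(samples):
    shuffled = samples[:]
    random.shuffle(shuffled)          # unseeded global RNG -> irreproducible
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Fixed: a dedicated, explicitly seeded RNG makes the split deterministic
# across runs, so the train/test partition can be reproduced exactly.
def split_data(samples, seed=42):
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

A static rule for this smell could flag calls to `random.shuffle` (or similar RNG-dependent APIs) in code paths where no seed is ever set, which is the kind of declarative check the paper's DSL is designed to express.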