🤖 AI Summary
Deep learning (DL) code refactoring lacks systematic investigation, and existing IDEs and refactoring tools lack support for DL-specific semantics such as tensor operations and automatic differentiation. Method: We conduct the first large-scale empirical study, analyzing 4,921 refactoring commits across five mainstream DL projects (e.g., PyTorch) and surveying 159 practitioners. Using manual commit analysis, experience mining, and cross-project statistical comparison, we characterize DL refactoring patterns and tooling gaps. Contribution/Results: We find that DL refactoring predominantly targets model architecture and data pipeline adjustments, and that the distribution of refactoring types differs significantly from that of traditional Java software. Current tools universally lack DL semantic awareness. Based on these findings, we propose design principles for DL-aware refactoring tools, emphasizing tensor dependency modeling and computational graph awareness. We further formulate a practical roadmap for integrating these capabilities into next-generation DL development environments.
📝 Abstract
With the rapid development of deep learning, intricate algorithms and substantial data processing have become standard elements of deep learning projects. As a result, the code grows progressively more complex as the software evolves, making it difficult to maintain and understand. Existing studies have investigated the impact of refactoring on software quality in traditional software. However, how code refactoring is practiced in the context of deep learning is still unclear. This study aims to fill this knowledge gap by empirically examining the current state of code refactoring in the deep learning realm and practitioners' views on refactoring. We first manually analyzed the commit history of five popular and well-maintained deep learning projects (e.g., PyTorch). We mined 4,921 refactoring practices from historical commits and measured how different types and elements of refactoring operations are distributed. We found that the distribution of refactoring operation types in deep learning projects differs from that in traditional Java software. We then surveyed 159 practitioners about their views on code refactoring in deep learning projects and their expectations of current refactoring tools. The survey results showed that refactoring research and the development of related tools in the field of deep learning are crucial for improving project maintainability and code quality, and that current refactoring tools do not adequately meet the needs of practitioners. Lastly, we provide our perspective on the future advancement of refactoring tools and offer suggestions for developers' refactoring practices.