Identifier Name Similarities: An Exploratory Study

📅 2025-07-24
🤖 AI Summary
This study addresses the naming confusion caused by similar identifier names and its adverse effects on code comprehension, maintainability, and developer collaboration. Because prior work lacks a systematic classification framework, we propose the first taxonomy of identifier name similarity, spanning semantic, orthographic, and contextual dimensions and designed for both theoretical rigor and practical scalability. By analyzing naming patterns across large-scale open-source projects, we identify six high-frequency similarity categories (e.g., spelling variants, abbreviation conflicts, semantic near-synonyms) and validate their prevalence and detrimental impact in real-world codebases. The taxonomy provides a reusable theoretical foundation and methodological support for assessing identifier naming quality, designing static analysis tools, and developing shared naming conventions.
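
To make the orthographic dimension concrete, the following is a minimal sketch of how near-identical spellings might be flagged in a list of identifiers. It is not the study's method: the function names, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration, and semantic or contextual similarity is out of scope here.

```python
from difflib import SequenceMatcher
from itertools import combinations


def orthographic_similarity(a: str, b: str) -> float:
    """Return a case-insensitive character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_pairs(names, threshold=0.8):
    """List identifier pairs whose spelling similarity meets the threshold.

    The threshold is an illustrative assumption, not a value from the study.
    """
    pairs = []
    # Deduplicate and sort so the output order is deterministic.
    for a, b in combinations(sorted(set(names)), 2):
        score = orthographic_similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


if __name__ == "__main__":
    identifiers = ["userCount", "usersCount", "total"]
    for a, b, score in similar_pairs(identifiers):
        print(f"{a} ~ {b} (similarity {score})")
```

A character-level ratio like this only captures spelling variants. Abbreviation conflicts and semantic near-synonyms would need different signals, such as abbreviation expansion or word-level semantics.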

📝 Abstract
Identifier names, which comprise a significant portion of a codebase, are the cornerstone of effective program comprehension. However, research has shown that poorly chosen names can significantly increase cognitive load and hinder collaboration. Even names that appear readable in isolation may lead to misunderstandings in contexts where they closely resemble other names in either structure or functionality. In this exploratory study, we present our preliminary findings on the occurrence of identifier name similarity in software projects through the development of a taxonomy that categorizes different forms of identifier name similarity. We envision our initial taxonomy providing researchers with a platform to analyze and evaluate the impact of identifier name similarity on code comprehension, maintainability, and collaboration among developers, while also allowing for further refinement and expansion of the taxonomy.
Problem

Research questions and friction points this paper is trying to address.

Explores how often similar identifier names occur in software projects
Considers how name similarity can affect code comprehension, maintainability, and collaboration
Addresses the lack of a taxonomy for categorizing forms of name similarity
Innovation

Methods, ideas, or system contributions that make the work stand out.

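Develops an initial taxonomy of identifier name similarity
Gives researchers a platform for evaluating the impact of name similarity on comprehension, maintainability, and collaboration
Leaves room for further refinement and expansion of the similarity categories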
Carol Wong
University of Hawai‘i at Mānoa, Honolulu, Hawai‘i, USA
Mai Abe
University of Hawai‘i at Mānoa, Honolulu, Hawai‘i, USA
Silvia De Benedictis
University of Hawai‘i at Mānoa, Honolulu, Hawai‘i, USA
Marissa Halim
University of Hawai‘i at Mānoa, Honolulu, Hawai‘i, USA
Anthony Peruma
University of Hawai‘i at Mānoa
Program Comprehension, Software Refactoring, Software Maintenance, Software Evolution