Development and Evolution of Xtext-based DSLs on GitHub: An Empirical Investigation

📅 2025-01-31
🤖 AI Summary
The lack of systematic understanding of domain-specific language (DSL) evolution hinders the advancement of model-driven engineering (MDE) methods and tooling. Method: This study conducts a large-scale empirical analysis of 1,002 GitHub repositories containing Xtext-related projects, of which 226 contain a fully developed textual DSL; these span 18 application domains. The methodology combines systematic GitHub repository mining, manual classification, and analysis of DSL artifacts (grammar definitions and example instances) to quantify grammar rule usage and characterize change types. Contribution/Results: DSLs in certain domains, such as Data Management and Databases, are used more, evolve faster, and are maintained longer. Grammar-driven development is the dominant approach, though some projects are metamodel-driven, and Xtext is also used to retrofit existing languages. Of the 722 repositories containing grammar definitions, only about a third provide textual instances, and most of those instances exercise over 60% of the grammar rules. Evolution activities are predominantly perfective. The study contributes a dataset of repositories with meta-information to support research and tool development for DSL evolution.

📝 Abstract
Domain-specific languages (DSLs) play a crucial role in facilitating a wide range of software development activities in the context of model-driven engineering (MDE). However, a systematic understanding of their evolution is lacking, which hinders methodology and tool development. To address this gap, we performed a comprehensive investigation into the development and evolution of textual DSLs created with Xtext, a particularly widely used language workbench in MDE. We systematically identified and analyzed 1002 GitHub repositories containing Xtext-related projects. A manual classification of the repositories yielded 226 that contain a fully developed language. These were further categorized into 18 application domains, where we examined DSL artifacts and the availability of example instances. We explored DSL development practices, including development scenarios, evolution activities, and co-evolution of related artifacts. We observed that DSLs are used more, evolve faster, and are maintained longer in specific domains, such as Data Management and Databases. We identified DSL grammar definitions in 722 repositories, but only a third provided textual instances, with most utilizing over 60% of grammar rules. We found that most analyzed DSLs followed a grammar-driven approach, though some adopted a metamodel-driven approach. Additionally, we observed a trend of retrofitting existing languages in Xtext, demonstrating its flexibility beyond new DSL creation. We found that in most DSL development projects, updates to grammar definitions and example instances are very frequent, and most of the evolution activities can be classified as "perfective" changes. To support the research in the model-driven engineering community, we contribute a dataset of repositories with meta-information, helping to develop improved tools for DSL evolution.
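
The grammar rule usage figure mentioned in the abstract (instances "utilizing over 60% of grammar rules") can be read as a simple coverage ratio. The sketch below is only an illustration of that reading, not the authors' tooling: the function name `rule_coverage` and the toy rule names are hypothetical, and it assumes the set of rules exercised by the example instances has already been obtained by some other means, such as parsing the instances.

```python
def rule_coverage(defined_rules, used_rules):
    """Return the fraction of defined grammar rules that also appear in used_rules."""
    defined = set(defined_rules)
    if not defined:
        # No rules defined: report zero coverage instead of dividing by zero.
        return 0.0
    # Only rules that are actually defined in the grammar count as covered.
    return len(defined & set(used_rules)) / len(defined)


# Hypothetical rule names for a toy grammar (not taken from the paper's dataset).
defined = {"Model", "Entity", "Attribute", "Relation", "DataType"}
# Rules assumed to be exercised by example instances; "Import" is not defined, so it is ignored.
used = {"Model", "Entity", "Attribute", "Import"}

coverage = rule_coverage(defined, used)
print(f"{coverage:.0%}")
```

Under this reading, a repository counts toward the "over 60%" group when its instances exercise more than 60% of the rules its grammar defines. How the paper treats terminal or unused helper rules is not stated in the abstract.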

Problem

Research questions and friction points this paper is trying to address.

Domain-Specific Languages (DSLs)
Evolution Processes
Model-Driven Engineering (MDE)

Innovation

Methods, ideas, or system contributions that make the work stand out.

Xtext-based DSLs
Evolution Analysis
Grammar-driven vs Metamodel-driven