Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs

📅 2025-12-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
When the syntax of a textual domain-specific language (DSL) evolves, existing instances become outdated, and auxiliary information such as comments, formatting, and layout risks being lost. Method: This paper proposes an LLM-driven co-evolution approach for DSL syntax definitions and their textual instances. Using Claude-3.5 and GPT-4o, it performs end-to-end migration of both grammar definitions and concrete instances across seven DSLs, aiming to preserve semantic correctness together with auxiliary-information fidelity. Results: Experiments show that the LLMs maintain structural integrity, readability, and formatting consistency for small-scale instances, but significant scalability bottlenecks emerge for larger ones. The work outlines a paradigm for evolving textual DSLs where traditional model-driven co-evolution techniques fall short, and extends LLM applications in software engineering toward syntax-aware, grammar-preserving evolution.

📝 Abstract
Software languages evolve over time for various reasons, such as the addition of new features. When the language's grammar definition evolves, textual instances that originally conformed to the grammar become outdated. For DSLs in a model-driven engineering context, there exists a plethora of techniques to co-evolve models with the evolving metamodel. However, these techniques are not geared to support DSLs with a textual syntax -- applying them to textual language definitions and instances may lead to the loss of information from the original instances, such as comments and layout information, which are valuable for software comprehension and maintenance. This study explores the potential of Large Language Model (LLM)-based solutions in achieving grammar and instance co-evolution, with attention to their ability to preserve auxiliary information when directly processing textual instances. By applying two advanced language models, Claude-3.5 and GPT-4o, and conducting experiments across seven case languages, we evaluated the feasibility and limitations of this approach. Our results indicate a good ability of the considered LLMs for migrating textual instances in small-scale cases with limited instance size, which are representative of a subset of cases encountered in practice. In addition, we observe significant challenges with the scalability of LLM-based solutions to larger instances, leading to insights that are useful for informing future research.
Problem

Research questions and friction points this paper is trying to address.

Co-evolve textual DSL definitions and instances
Preserve auxiliary information like comments and layout
Assess LLM scalability for large instance migration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using LLMs to co-evolve textual DSL definitions and instances
Preserving comments and layout information during migration
Evaluating feasibility with Claude-3.5 and GPT-4o on seven languages
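To make the co-evolution setup concrete, the following is a minimal sketch of how a migration request to an LLM might be assembled. The grammar snippets, instance text, function name, and prompt wording are illustrative assumptions for this summary, not artifacts from the paper.

```python
# Hypothetical example: the new grammar adds an optional default value to
# attributes; the instance must be migrated while keeping its comments
# and layout intact. All names and snippets below are illustrative.

OLD_GRAMMAR = """\
Entity: 'entity' name=ID '{' attributes+=Attribute* '}';
Attribute: name=ID ':' type=ID;
"""

NEW_GRAMMAR = """\
Entity: 'entity' name=ID '{' attributes+=Attribute* '}';
Attribute: name=ID ':' type=ID ('=' default=STRING)?;
"""

OLD_INSTANCE = """\
// Customer record, reviewed 2024
entity Customer {
    name : String   // full legal name
    age  : Int
}
"""

def build_migration_prompt(old_grammar: str, new_grammar: str,
                           instance: str) -> str:
    """Compose a prompt asking the model to migrate the instance to the
    new grammar while preserving comments, indentation, and layout."""
    return (
        "The grammar of a textual DSL has evolved.\n\n"
        f"Old grammar:\n{old_grammar}\n"
        f"New grammar:\n{new_grammar}\n"
        f"Instance conforming to the old grammar:\n{instance}\n"
        "Rewrite the instance so it conforms to the new grammar. "
        "Preserve all comments, indentation, and layout wherever the "
        "new syntax allows. Output only the migrated instance."
    )

prompt = build_migration_prompt(OLD_GRAMMAR, NEW_GRAMMAR, OLD_INSTANCE)
print(len(prompt) > 0)
```

The prompt string would then be sent to a model such as Claude-3.5 or GPT-4o; the paper's scalability findings suggest this direct-prompting style works for small instances but degrades as instance size grows.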