🤖 AI Summary
This study investigates whether large language models (LLMs) genuinely explore novel program structures or fall into repetitive cycles when they iteratively mutate code without selection pressure. The authors construct LLM-driven mutation chains, analyze program structure, detect cycles, and compare against subtree mutation from classical genetic programming (GP) as a baseline. The results reveal a strong structural convergence bias in LLM-based mutation: in 87% of mutation chains, over 93% of mutations revisit a previously seen structural form, with variation largely confined to terminal substitutions. Short cycles and self-loops are prevalent across prompt designs, model families, and stochastic replications, suggesting that LLM-based mutation gravitates toward restricted attractor regions, a behavior markedly distinct from classical GP.
📝 Abstract
When an LLM repeatedly mutates a program, does it explore new forms or circle back to the same ones? We study this question by analyzing LLM-driven mutation chains in the absence of selection pressure within a domain-specific language, varying prompt design, model family, and stochastic replication. We find that LLM-based mutation consistently converges toward restricted attractor regions in program space. Convergence is especially severe at the structural level: in 87% of chains, over 93% of mutations revisit a previously seen structural form, with most variation confined to terminal substitutions within recurring templates. Cycle analysis reveals short cycles and self-loops dominating the transition structure. The rate of convergence varies with prompt wording and model choice, but the phenomenon is robust across conditions. A classical GP subtree mutation operator does not exhibit comparable convergence, suggesting that the effect is intrinsic to the LLM mutation pipeline. These findings reveal a tension at the heart of LLM-driven program evolution: the same capabilities that enable semantics-aware program transformation also carry a systematic bias toward structural homogeneity that must be accounted for if such systems are to sustain open-ended exploration. Source code is available at https://github.com/can-gurkan/lmca.
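To make the structural-revisit and self-loop measurements concrete, here is a minimal sketch of how such statistics could be computed over a single mutation chain. It is not the authors' implementation. The nested-tuple program representation, the `"_"` placeholder, and the function names `skeleton`, `structural_revisit_rate`, and `self_loop_rate` are all assumptions made for illustration, and the example chain is invented data.

```python
def skeleton(program):
    """Replace every terminal with a placeholder, keeping operators and tree shape.

    Assumption: a program is a nested tuple (operator, arg1, arg2, ...),
    and anything that is not a non-empty tuple counts as a terminal.
    """
    if isinstance(program, tuple) and program:
        head, *args = program
        return (head, *(skeleton(arg) for arg in args))
    return "_"


def structural_revisit_rate(chain):
    """Fraction of mutation steps whose skeleton already appeared earlier in the chain."""
    if len(chain) < 2:
        raise ValueError("a chain needs at least one mutation step")
    seen = {skeleton(chain[0])}
    revisits = 0
    for program in chain[1:]:
        shape = skeleton(program)
        if shape in seen:
            revisits += 1
        seen.add(shape)
    return revisits / (len(chain) - 1)


def self_loop_rate(chain):
    """Fraction of mutation steps that return the parent program unchanged."""
    if len(chain) < 2:
        raise ValueError("a chain needs at least one mutation step")
    steps = list(zip(chain, chain[1:]))
    return sum(parent == child for parent, child in steps) / len(steps)


# Invented toy chain: the initial program followed by four mutations.
chain = [
    ("add", "x", 1),
    ("add", "y", 2),               # terminal substitution only: same skeleton
    ("mul", ("add", "x", 1), 3),   # new structural form
    ("add", "x", 1),               # returns to an earlier form
    ("add", "x", 1),               # self-loop: identical to its parent
]

revisit = structural_revisit_rate(chain)
loops = self_loop_rate(chain)
print(revisit, loops)
```

In this toy chain, three of the four mutations land on a skeleton that was already seen, and one of the four is an exact self-loop. The abstract's headline figure would correspond to computing a rate like `structural_revisit_rate` per chain and then counting the chains where it exceeds 0.93.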