Multi-Language Benchmark Generation via L-Systems

📅 2025-12-19

🤖 AI Summary

Problem: Existing program generators such as Csmith were designed to uncover compiler bugs rather than to evaluate performance, which limits their usefulness as benchmarks for compilers, operating systems, and computer architectures. Method: This paper adapts L-systems, the rewriting formalism that Aristid Lindenmayer proposed for simulating branching organic structures, to generate large and complex artificial programs in C, C++, Julia, and Go. Contribution/Results: Six case studies demonstrate the approach: a comparison between clang and gcc; a comparison between C, C++, Julia, and Go; the historical evolution of gcc in terms of code quality; the effects of profile-guided optimization (PGO) in gcc; the asymptotic behavior of the phases of clang's compilation pipeline; and a comparison of the data structures available in GLib.

📝 Abstract

L-systems are a mathematical formalism proposed by biologist Aristid Lindenmayer to simulate organic structures such as trees, snowflakes, flowers, and other branching phenomena. They are implemented as a formal language that defines how patterns can be iteratively rewritten. This paper describes how this formalism can be used to create artificial programs in languages such as C, C++, Julia, and Go. Because these programs are large and complex, they can be used to test the performance of compilers, operating systems, and computer architectures. The paper demonstrates the usefulness of these benchmarks through six case studies: a comparison between clang and gcc; a comparison between C, C++, Julia, and Go; a study of the historical evolution of gcc in terms of code quality; a look into the effects of profile-guided optimizations in gcc; an analysis of the asymptotic behavior of the different phases of clang's compilation pipeline; and a comparison between the many data structures available in the GNOME Library (GLib). These case studies demonstrate the benefits of the L-system approach to creating benchmarks when compared with fuzzers such as Csmith, which were designed to uncover bugs in compilers rather than to evaluate their performance.
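
To make the idea of iterative rewriting concrete, here is a minimal sketch of an L-system in Python. It uses Lindenmayer's classic algae system (axiom `A`, rules `A → AB` and `B → A`) and is not the grammar used in the paper. The function name `rewrite` and the constant `ALGAE_RULES` are illustrative assumptions.

```python
def rewrite(axiom: str, rules: dict[str, str], steps: int) -> str:
    """Apply the production rules to every symbol in parallel, `steps` times.

    Symbols without a rule are copied unchanged (they act as constants).
    """
    current = axiom
    for _ in range(steps):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Lindenmayer's algae system: an illustrative example, not the paper's grammar.
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for n in range(6):
        word = rewrite("A", ALGAE_RULES, n)
        print(n, len(word), word)
```

Each derivation step replaces all symbols at once, so the string grows quickly: in this system the lengths follow the Fibonacci sequence (1, 2, 3, 5, 8, 13). That rapid, structured growth is what makes the formalism attractive for producing large programs from a handful of rules.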

Problem

Research questions and friction points this paper is trying to address.

Generating large, complex programs in multiple languages for performance testing

Comparing compiler and language performance through systematic benchmark creation

Evaluating optimization techniques and historical evolution of compilation tools

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using L-systems to generate multi-language artificial programs

Creating large, complex benchmarks for compiler and system evaluation

Applying L-systems for performance testing across programming languages
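
One way to picture multi-language generation is to derive a single L-system string and then render it through per-language templates. The sketch below does this for C and Go. Everything in it is a hypothetical illustration: the alphabet (`S` for a statement, `[` and `]` for opening and closing a loop), the rule `S → S[S]`, and the `TEMPLATES` table are assumptions, not the paper's actual grammars.

```python
# Hypothetical alphabet: "S" is a statement, "[" opens a loop, "]" closes it.
RULES = {"S": "S[S]"}

# Illustrative per-language templates; "{d}" is filled with the nesting depth.
TEMPLATES = {
    "c": {
        "header": "#include <stdio.h>\nint main(void) {\n    long acc = 0;",
        "loop": "for (int i{d} = 0; i{d} < 2; i{d}++) {{",
        "stmt": "acc += 1;",
        "footer": '    printf("%ld\\n", acc);\n    return 0;\n}',
    },
    "go": {
        "header": 'package main\n\nimport "fmt"\n\nfunc main() {\n    acc := 0',
        "loop": "for i{d} := 0; i{d} < 2; i{d}++ {{",
        "stmt": "acc += 1",
        "footer": "    fmt.Println(acc)\n}",
    },
}


def rewrite(axiom: str, rules: dict[str, str], steps: int) -> str:
    """Apply the production rules to every symbol in parallel, `steps` times."""
    current = axiom
    for _ in range(steps):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def render(word: str, lang: str) -> str:
    """Translate an L-system string into source text for the given language."""
    template = TEMPLATES[lang]
    lines = [template["header"]]
    depth = 1  # the body of main() starts at nesting depth 1
    for symbol in word:
        if symbol == "S":
            lines.append("    " * depth + template["stmt"])
        elif symbol == "[":
            lines.append("    " * depth + template["loop"].format(d=depth))
            depth += 1
        elif symbol == "]":
            depth -= 1
            lines.append("    " * depth + "}")
        else:
            raise ValueError(f"unknown symbol: {symbol!r}")
    lines.append(template["footer"])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    word = rewrite("S", RULES, 2)
    print(render(word, "c"))
    print(render(word, "go"))
```

Because both outputs come from the same derived string, the C and Go programs share the same control-flow shape, which is the property that makes cross-language comparisons meaningful. Increasing the number of rewriting steps grows both programs in lockstep.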
