LLM-Meta-SR: Learning to Evolve Selection Operators for Symbolic Regression

📅 2025-05-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
In symbolic regression, selection operators are typically handcrafted, and existing LLM-driven evolutionary approaches suffer from code bloat and a lack of semantic guidance—hindering both interpretability and evolutionary efficiency. This paper proposes the first learning-based evolutionary framework that leverages large language models (LLMs) to automatically synthesize efficient and interpretable selection operators. We introduce two novel mechanisms: semantic-aware evaluation and code-bloat control, integrated with domain-knowledge-enhanced prompt engineering to jointly optimize operator generation for semantic validity and evolutionary efficacy. Evaluated on standard symbolic regression benchmarks, our automatically generated operators consistently outperform nine expert-designed baselines. To our knowledge, this is the first work demonstrating that LLMs can not only match but surpass human experts in algorithmic design—specifically, in crafting high-performing, semantically grounded selection operators for evolutionary symbolic regression.

📝 Abstract
Large language models (LLMs) have revolutionized algorithm development, yet in symbolic regression, where algorithms automatically discover symbolic expressions from data, key algorithmic components are still constrained by manual design from human experts. In this paper, we propose a learning-to-evolve framework that enables LLMs to automatically design selection operators for evolutionary symbolic regression algorithms. We first identify two key limitations in existing LLM-based algorithm evolution techniques: code bloat and a lack of semantic guidance. Bloat results in unnecessarily complex components, and the absence of semantic awareness can lead to ineffective exchange of useful code components, both of which can reduce the interpretability of the designed algorithm or hinder evolutionary learning progress. To address these issues, we enhance the LLM-based evolution framework for meta symbolic regression with two key innovations: bloat control and a complementary, semantics-aware selection operator. Additionally, we embed domain knowledge into the prompt, enabling the LLM to generate more effective and contextually relevant selection operators. Our experimental results on symbolic regression benchmarks show that LLMs can devise selection operators that outperform nine expert-designed baselines, achieving state-of-the-art performance. This demonstrates that LLMs can exceed expert-level algorithm design for symbolic regression.
Problem

Research questions and friction points this paper is trying to address.

Automating selection operator design for symbolic regression using LLMs
Addressing code bloat and semantic guidance in LLM-based algorithm evolution
Enhancing interpretability and performance of evolutionary symbolic regression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning-to-evolve framework for LLM-based selection operators
Bloat control and semantics-aware operator enhancement
Domain knowledge embedding in prompts for better performance
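To make the two mechanisms above concrete, here is a minimal illustrative sketch (not the paper's actual operator) of a tournament-style selection that combines a size penalty for bloat control with a semantic-distance bonus for semantics-aware parent choice. The dict-based individual representation and the `alpha` and diversity weights are assumptions for illustration only.

```python
import random

def semantic_distance(a, b):
    # Euclidean distance between per-case output vectors (the "semantics")
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_parent(population, reference, alpha=0.01, k=4):
    """Pick one parent via tournament selection.

    Score = error + size penalty (bloat control)
                  - semantic-diversity bonus relative to a reference
                    individual already chosen for crossover.
    Lower score wins. Weights are illustrative assumptions.
    """
    tournament = random.sample(population, k)

    def score(ind):
        return (ind["error"]
                + alpha * ind["size"]
                - 0.1 * semantic_distance(ind["semantics"],
                                          reference["semantics"]))

    return min(tournament, key=score)
```

A design note: penalizing `size` directly in the selection score is one simple form of bloat control, while the distance bonus steers selection toward parents whose behavior differs from the first parent, encouraging the exchange of semantically complementary code components.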
Hengzhe Zhang
Victoria University of Wellington
Genetic Programming, AutoML
Qi Chen
Centre for Data Science and Artificial Intelligence, School of Engineering and Computer Science, Victoria University of Wellington, New Zealand
Bing Xue
Meta Superintelligence Labs
LLM, machine learning for healthcare, representation learning, generative models
Mengjie Zhang
Centre for Data Science and Artificial Intelligence, School of Engineering and Computer Science, Victoria University of Wellington, New Zealand