Iterated Agent for Symbolic Regression

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Symbolic regression (SR) suffers from combinatorial explosion of the search space, overfitting, and poor model interpretability. This paper proposes IdeaSearchFitter, an LLM-augmented evolutionary framework in which large language models serve as semantic operators: they generate and mutate candidate expressions guided by natural-language rationales, in place of the purely syntactic search used by genetic-programming methods. On the Feynman Symbolic Regression Database (FSReD), the method achieves competitive, noise-robust performance and outperforms several strong baselines, while favoring concise, physically interpretable expressions. In a high-energy physics application, it derives compact, physically motivated parametrizations for Parton Distribution Functions. The core contribution is integrating the semantic reasoning capability of LLMs into the evolutionary search loop, shifting the search from syntactic evolution toward concept-driven scientific discovery.

📝 Abstract
Symbolic regression (SR), the automated discovery of mathematical expressions from data, is a cornerstone of scientific inquiry. However, it is often hindered by the combinatorial explosion of the search space and a tendency to overfit. Popular methods, rooted in genetic programming, explore this space syntactically, often yielding overly complex, uninterpretable models. This paper introduces IdeaSearchFitter, a framework that employs Large Language Models (LLMs) as semantic operators within an evolutionary search. By generating candidate expressions guided by natural-language rationales, our method biases discovery towards models that are not only accurate but also conceptually coherent and interpretable. We demonstrate IdeaSearchFitter's efficacy across diverse challenges: it achieves competitive, noise-robust performance on the Feynman Symbolic Regression Database (FSReD), outperforming several strong baselines; discovers mechanistically aligned models with good accuracy-complexity trade-offs on real-world data; and derives compact, physically-motivated parametrizations for Parton Distribution Functions in a frontier high-energy physics application. IdeaSearchFitter is a specialized module within our broader iterated agent framework, IdeaSearch, which is publicly available at https://www.ideasearch.cn/.
Problem

Research questions and friction points this paper is trying to address.

Automating mathematical expression discovery from data
Addressing combinatorial explosion and overfitting in symbolic regression
Generating interpretable models using semantic-guided evolutionary search
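
The overfitting concern above is commonly handled by comparing candidates on both error and complexity rather than error alone. The following is a minimal sketch of that idea, not the paper's procedure: `Candidate`, `pareto_front`, and the example numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    expr: str        # human-readable expression
    error: float     # fit error on the data (lower is better)
    complexity: int  # e.g. node count of the expression tree (lower is better)


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both axes."""
    front = []
    for c in cands:
        dominated = any(
            o.error <= c.error
            and o.complexity <= c.complexity
            and (o.error < c.error or o.complexity < c.complexity)
            for o in cands
        )
        if not dominated:
            front.append(c)
    # Order from simplest to most complex for easier inspection.
    return sorted(front, key=lambda c: c.complexity)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    cands = [
        Candidate("x", 0.5, 1),
        Candidate("sin(x)", 0.6, 2),
        Candidate("x**2", 0.1, 3),
        Candidate("x**2 + 0.001*x**7", 0.1, 9),
    ]
    for c in pareto_front(cands):
        print(c)
```

In this toy input the longer expression fits no better than `x**2`, so it is dropped: extra terms that do not reduce error are treated as unnecessary complexity.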
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses LLMs as semantic operators in evolutionary search
Generates expressions guided by natural-language rationales
Biases discovery toward interpretable and accurate models
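
The loop described by these points can be sketched in a few lines. This is a toy illustration under strong assumptions, not the IdeaSearchFitter implementation: the LLM call is replaced by a fixed lookup table (`IDEA_POOL`), and the names `Idea`, `semantic_operator`, `mse`, and `evolve` are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Idea:
    rationale: str                 # natural-language reason for the functional form
    expr: str                      # human-readable expression
    fn: Callable[[float], float]   # callable version of the expression


# Assumption: stands in for what an LLM might propose. A real system would
# prompt a model with the data description and the current best candidates.
IDEA_POOL = [
    Idea("response proportional to the input", "2*x", lambda x: 2 * x),
    Idea("periodic signal", "sin(x)", lambda x: math.sin(x)),
    Idea("energy-like term, quadratic in the variable", "0.5*x**2", lambda x: 0.5 * x * x),
    Idea("unbounded exponential growth", "exp(x)", lambda x: math.exp(x)),
]


def semantic_operator(history: Set[str], rng: random.Random) -> Optional[Idea]:
    """Propose an idea not tried before, or None when the pool is exhausted."""
    untried = [idea for idea in IDEA_POOL if idea.expr not in history]
    return rng.choice(untried) if untried else None


def mse(idea: Idea, xs: List[float], ys: List[float]) -> float:
    return sum((idea.fn(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def evolve(xs, ys, generations: int = 10, pop_size: int = 2, seed: int = 0) -> Idea:
    rng = random.Random(seed)
    history: Set[str] = set()
    population: List[Idea] = []
    for _ in range(generations):
        child = semantic_operator(history, rng)
        if child is None:
            break  # nothing new to propose
        history.add(child.expr)
        population.append(child)
        # Selection: keep only the best-fitting candidates.
        population.sort(key=lambda idea: mse(idea, xs, ys))
        population = population[:pop_size]
    return population[0]


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 1.5, 2.0]
    ys = [0.5 * x * x for x in xs]
    best = evolve(xs, ys)
    print(best.expr, "because:", best.rationale)
```

Each surviving candidate carries the rationale it was proposed with, so the selected model comes with a stated reason for its form. That pairing is the property the points above emphasize.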
Zhuo-Yang Song
Undergraduate Student of Physics, Peking University
hep-ph, cs.CL
Zeyu Cai
Institute of Heavy Ion Physics, Peking University
AI for Science, Plasma Physics, AI Agents, Number Theory
Shutao Zhang
School of Physics, Peking University, Beijing 100871, China
Jiashen Wei
School of Physics, Peking University, Beijing 100871, China
Jichen Pan
School of Physics, Peking University, Beijing 100871, China; Center for High Energy Physics, Peking University, Beijing 100871, China
Shi Qiu
School of Physics, Peking University, Beijing 100871, China; University of California, Berkeley
Qing-Hong Cao
Peking University
High Energy Physics
Tie-Jiun Hou
School of Nuclear Science and Technology, University of South China, Hengyang, Hunan 421001, China
Xiaohui Liu
Center of Advanced Quantum Studies, School of Physics and Astronomy, Beijing Normal University, Beijing, 100875, China; Key Laboratory of Multi-scale Spin Physics, Ministry of Education, Beijing Normal University, Beijing 100875, China
Ming-xing Luo
Beijing Computational Science Research Center, Beijing 100193, China
Hua Xing Zhu
Peking University
Quantum Field Theory, QCD, Effective Field Theory