GP-GOMEA with GPU-Based Fitness Evaluations: Design and Performance Analysis

📅 2026-05-29
🤖 AI Summary
This work addresses the scalability limitations of GP-GOMEA in symbolic regression, which stem from its high computational cost on large datasets and complex target expressions. The study takes a first major step toward a fully GPU-accelerated GP-GOMEA by moving fitness evaluation to the GPU: it introduces a GPU-friendly representation of GP-GOMEA's template-based individuals and an evaluation strategy that exploits the inherent parallelism of population-based search. This substantially increases evaluation throughput, enabling orders of magnitude more evaluations within the same time budget. Across four standard symbolic regression benchmarks, the added evaluation capacity yields performance improvements, particularly for larger datasets and larger population sizes. It also allows a problem-agnostic evolutionary algorithm, for the first time, to reliably regress one of the largest Feynman equations within four hours. Finally, the work analyzes how expression structure affects search difficulty for GP-GOMEA.
📝 Abstract
GP-GOMEA is a state-of-the-art evolutionary algorithm for symbolic regression, known for discovering small and interpretable models. However, its computational cost remains substantial, limiting its applicability to larger datasets and more complex target expressions. In contrast, the rise of modern subsymbolic approaches, particularly deep learning, has been driven largely by the massive parallelism offered by GPUs. In this work, we take the first major step toward a fully GPU-accelerated GP-GOMEA by introducing a GPU-based fitness evaluation scheme. We design a GPU-friendly representation of GP-GOMEA's template-based individuals and a corresponding evaluation strategy that exploits the inherent parallelism of population-based search. This substantially increases evaluation throughput, enabling orders of magnitude more evaluations within the same time budget. Across four standard symbolic regression benchmarks, this increased evaluation capacity yields performance improvements, particularly for larger datasets and larger population sizes. Moreover, the ability to efficiently evaluate much larger datasets and more complex templates enables analyses that were previously infeasible, allowing us to systematically analyze what makes expressions increasingly difficult for GP-GOMEA, providing new insights into how expression structure affects search difficulty. Finally, for the first time, this expanded capability allows a problem-agnostic evolutionary algorithm to reliably regress one of the largest Feynman equations within four hours.
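To make the idea of a GPU-friendly, template-based representation more concrete, the following is a minimal CPU-side sketch and not the paper's implementation. It assumes each individual is a full binary tree stored in heap order as two flat, fixed-length arrays, so every individual has the same memory layout. The opcode table and the names `evaluate_template` and `population_mse` are hypothetical, and mean squared error is used only as an example fitness measure. On a GPU, each (individual, data row) pair in the nested loops could be mapped to its own thread.

```python
# Hypothetical opcode table; the paper's actual operator set is not given here.
ADD, SUB, MUL, VAR, CONST = range(5)


def evaluate_template(ops, args, row):
    """Evaluate one individual stored as a full binary tree in heap order.

    ops[i]  : opcode at node i (children live at 2*i + 1 and 2*i + 2)
    args[i] : feature index for VAR, constant value for CONST, unused otherwise
    row     : one data point (sequence of feature values)

    Assumption: nodes on the last level of the template are always terminals,
    so operator nodes never index past the end of the arrays.
    """
    n = len(ops)
    vals = [0.0] * n
    # Visit nodes from last to first so children are computed before parents.
    # Subtrees below a terminal are inactive but still computed, which keeps
    # the control flow identical for every individual.
    for i in range(n - 1, -1, -1):
        op = ops[i]
        if op == VAR:
            vals[i] = float(row[int(args[i])])
        elif op == CONST:
            vals[i] = float(args[i])
        else:
            left, right = vals[2 * i + 1], vals[2 * i + 2]
            if op == ADD:
                vals[i] = left + right
            elif op == SUB:
                vals[i] = left - right
            else:  # MUL
                vals[i] = left * right
    return vals[0]


def population_mse(population, X, y):
    """Return the mean squared error of every individual on the dataset.

    The two nested loops are independent across individuals and rows, which is
    the parallelism a GPU evaluation scheme could exploit.
    """
    fitness = []
    for ops, args in population:
        err = 0.0
        for row, target in zip(X, y):
            diff = evaluate_template(ops, args, row) - target
            err += diff * diff
        fitness.append(err / len(X))
    return fitness


# Two individuals on a height-2 template (7 nodes each).
ind_a = ([ADD, MUL, CONST, VAR, VAR, CONST, CONST],
         [0, 0, 2.0, 0, 1, 0.0, 0.0])          # x0 * x1 + 2
ind_b = ([VAR, CONST, CONST, CONST, CONST, CONST, CONST],
         [0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])    # x0
X = [[1.0, 2.0], [3.0, 4.0]]
y = [4.0, 14.0]
scores = population_mse([ind_a, ind_b], X, y)
```

Because every individual occupies the same number of slots, a whole population can be packed into one contiguous buffer, which is the kind of regular layout that tends to suit GPU memory access. Whether the paper uses this exact layout is not stated in the abstract.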
Problem

Research questions and friction points this paper is trying to address.

symbolic regression
computational cost
GP-GOMEA
large datasets
complex expressions
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU acceleration
symbolic regression
GP-GOMEA
fitness evaluation
evolutionary algorithm