Transformer Semantic Genetic Programming for d-dimensional Symbolic Regression Problems
Philipp Anthes, Dominik Sobania, Franz Rothlauf
arXiv.org Artificial Intelligence
Transformer Semantic Genetic Programming (TSGP) is a semantic search approach that uses a pre-trained transformer model as a variation operator to generate offspring programs with controlled semantic similarity to a given parent. Unlike other semantic GP approaches that rely on fixed syntactic transformations, TSGP aims to learn diverse structural variations that lead to solutions with similar semantics. We find that a single transformer model trained on millions of programs is able to generalize across symbolic regression problems of varying dimension. Evaluated on 24 real-world and synthetic datasets, TSGP significantly outperforms standard GP, SLIM_GSGP, Deep Symbolic Regression, and Denoising Autoencoder GP, achieving an average rank of 1.58 across all benchmarks. Moreover, TSGP produces more compact solutions than SLIM_GSGP, despite its higher accuracy. In addition, the target semantic distance $\mathrm{SD}_t$ is able to control the step size in the semantic space: small values of $\mathrm{SD}_t$ enable consistent improvement in fitness but often lead to larger programs, while larger values promote faster convergence and compactness. Thus, $\mathrm{SD}_t$ provides an effective mechanism for balancing exploration and exploitation.
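The abstract does not specify how semantic distance is computed, but the core idea — the semantics of a program is its vector of outputs over the training cases, and $\mathrm{SD}_t$ sets a target distance between parent and offspring semantics — can be sketched as follows. This is a minimal illustration assuming an RMSE-style distance; the function names and the acceptance rule around `SD_t` are hypothetical, not taken from the paper.

```python
import numpy as np

def semantics(program, X):
    """The semantics of a program: its vector of outputs
    over all fitness cases (rows of X)."""
    return np.array([program(x) for x in X])

def semantic_distance(parent, offspring, X):
    """Root-mean-square distance between the two output vectors
    (one plausible choice of semantic distance)."""
    sp, so = semantics(parent, X), semantics(offspring, X)
    return float(np.sqrt(np.mean((sp - so) ** 2)))

if __name__ == "__main__":
    # Hypothetical usage: accept an offspring only if its distance
    # to the parent lies near a target step size SD_t.
    X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
    parent = lambda x: x[0] ** 2
    offspring = lambda x: x[0] ** 2 + 0.1  # constant shift of 0.1
    d = semantic_distance(parent, offspring, X)  # = 0.1 here
    SD_t = 0.2          # target semantic distance (tunable)
    accept = abs(d - SD_t) < 0.15
```

Under this reading, small `SD_t` values restrict variation to nearby semantics (small, steady fitness steps), while larger values permit bigger jumps in semantic space, matching the exploration/exploitation trade-off the abstract describes.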
Nov-13-2025