Data-Efficient Symbolic Regression via Foundation Model Distillation
Ying, Wangyang, Zhang, Jinghan, Bai, Haoyue, Gong, Nanxu, Wang, Xinyuan, Liu, Kunpeng, Reddy, Chandan K., Fu, Yanjie
arXiv.org Artificial Intelligence
Discovering interpretable mathematical equations from observed data (a.k.a. equation discovery or symbolic regression) is a cornerstone of scientific discovery, enabling transparent modeling of physical, biological, and economic systems. While foundation models pre-trained on large-scale equation datasets offer a promising starting point, they often suffer from negative transfer and poor generalization when applied to small, domain-specific datasets. In this paper, we introduce EQUATE (Equation Generation via QUality-Aligned Transfer Embeddings), a data-efficient fine-tuning framework that adapts foundation models for symbolic equation discovery in low-data regimes via distillation. EQUATE combines symbolic-numeric alignment with evaluator-guided embedding optimization, enabling a principled embedding-search-generation paradigm. Our approach reformulates discrete equation search as a continuous optimization task in a shared embedding space, guided by data-equation fitness and simplicity. Experiments across three standard public benchmarks (Feynman, Strogatz, and black-box datasets) demonstrate that EQUATE consistently outperforms state-of-the-art baselines in both accuracy and robustness, while preserving low complexity and fast inference. These results highlight EQUATE as a practical and generalizable solution for data-efficient symbolic regression in foundation model distillation settings.
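The abstract says the search is guided by data-equation fitness and simplicity. As a rough illustration of that trade-off only (not EQUATE's actual evaluator, which is not specified here), the sketch below ranks a few hand-written candidate equations by R² minus a complexity penalty. The `score` function, the `penalty` weight, and the node-count complexity values are all assumptions made for this example.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination; 1.0 means a perfect fit."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def score(candidate, xs, ys, penalty=0.01):
    """Hypothetical evaluator: fitness (R^2) minus a complexity penalty."""
    func, complexity = candidate
    preds = [func(x) for x in xs]
    return r2_score(ys, preds) - penalty * complexity

# Toy observations generated from y = x^2 + 1 (assumed ground truth).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x + 1.0 for x in xs]

# Each candidate is (callable, complexity), where complexity is an
# assumed count of nodes in the expression tree.
candidates = {
    "x + 1": (lambda x: x + 1.0, 3),
    "x**2 + 1": (lambda x: x * x + 1.0, 5),
    "x**2 + 1 + 0*sin(x)": (lambda x: x * x + 1.0 + 0.0 * math.sin(x), 10),
}

scores = {name: score(c, xs, ys) for name, c in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Two of the candidates fit the toy data exactly, so the penalty term decides between them in favor of the shorter expression. A continuous embedding-space search, as described in the abstract, would optimize a signal of this kind rather than enumerate candidates by hand.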
Aug-28-2025
- Country:
  - North America
    - Mexico > Quintana Roo
      - Cancún (0.04)
    - United States
      - Arizona > Maricopa County
        - Tempe (0.04)
      - New York > New York County
        - New York City (0.04)
      - South Carolina (0.04)
      - Virginia > Montgomery County
        - Blacksburg (0.04)
  - South America > Chile
- Genre:
  - Research Report (1.00)
- Industry:
  - Banking & Finance > Economy (0.34)
- Technology: