xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

Beaglehole, Daniel; Holzmüller, David; Radhakrishnan, Adityanarayanan; Belkin, Mikhail

Aug-15-2025–arXiv.org Machine Learning

Tabular data, collections of continuous and categorical variables organized into matrices, underlies all aspects of modern commerce and science, from airplane engines to biology labs to bagel shops. Yet while machine learning and AI for language and vision have seen unprecedented progress, the primary methodologies for prediction from tabular data have remained relatively static, dominated by variations of Gradient Boosted Decision Trees (GBDTs) such as XGBoost [7]. Nevertheless, hundreds of tabular datasets have been assembled into extensive regression and classification benchmarks [11, 12, 16, 35, 37], and there has recently been renewed interest in building state-of-the-art predictive models for tabular data [15, 18, 19]. Notably, given the remarkable effectiveness of large "foundation" models for text, there has been much excitement about developing similar models for tabular data, and recent efforts have led to TabPFN-v2, a foundation model for tabular data published in Nature [18]. Despite this progress, model development for tabular data remains an active area, and building scalable, effective, and interpretable machine learning models in this domain remains an open challenge. In this work, we introduce xRFM, a tabular predictive model that combines recent advances in feature learning kernel machines with an adaptive tree structure, making it effective, scalable, and interpretable.



