Goto

Collaborating Authors

 infinitesimal gradient


An RKHS Perspective on Tree Ensembles

arXiv.org Machine Learning

Random Forests and Gradient Boosting are among the most effective algorithms for supervised learning on tabular data. Both belong to the class of tree-based ensemble methods, where predictions are obtained by aggregating many randomized regression trees. In this paper, we develop a theoretical framework for analyzing such methods through Reproducing Kernel Hilbert Spaces (RKHSs) constructed on tree ensembles--more precisely, on the random partitions generated by randomized regression trees. We establish fundamental analytical properties of the resulting Random Forest kernel, including boundedness, continuity, and universality, and show that a Random Forest predictor can be characterized as the unique minimizer of a penalized empirical risk functional in this RKHS, providing a variational interpretation of ensemble learning. We further extend this perspective to the continuous-time formulation of Gradient Boosting introduced by Dombry and Duchamps (2024a,b), and demonstrate that it corresponds to a gradient flow on a Hilbert manifold induced by the Random Forest RKHS. A key feature of this framework is that both the kernel and the RKHS geometry are data-dependent, offering a theoretical explanation for the strong empirical performance of tree-based ensembles. Finally, we illustrate the practical potential of this approach by introducing a kernel principal component analysis built on the Random Forest kernel, which enhances the interpretability of ensemble models, as well as GVI, a new geometric variable importance criterion.


A large sample theory for infinitesimal gradient boosting

arXiv.org Artificial Intelligence

Infinitesimal gradient boosting (Dombry and Duchamps, 2021) is defined as the vanishing-learning-rate limit of the popular tree-based gradient boosting algorithm from machine learning. It is characterized as the solution of a nonlinear ordinary differential equation in a infinite-dimensional function space where the infinitesimal boosting operator driving the dynamics depends on the training sample. We consider the asymptotic behavior of the model in the large sample limit and prove its convergence to a deterministic process. This population limit is again characterized by a differential equation that depends on the population distribution. We explore some properties of this population limit: we prove that the dynamics makes the test error decrease and we consider its long time behavior.


Infinitesimal gradient boosting

arXiv.org Machine Learning

We define infinitesimal gradient boosting as a limit of the popular tree-based gradient boosting algorithm from machine learning. The limit is considered in the vanishing-learning-rate asymptotic, that is when the learning rate tends to zero and the number of gradient trees is rescaled accordingly. For this purpose, we introduce a new class of randomized regression trees bridging totally randomized trees and Extra Trees and using a softmax distribution for binary splitting. Our main result is the convergence of the associated stochastic algorithm and the characterization of the limiting procedure as the unique solution of a nonlinear ordinary differential equation in a infinite dimensional function space. Infinitesimal gradient boosting defines a smooth path in the space of continuous functions along which the training error decreases, the residuals remain centered and the total variation is well controlled.