Goto

Collaborating Authors

Random Forests Weighted Local Fr\'echet Regression with Theoretical Guarantee

arXiv.org Machine Learning

Statistical analysis is increasingly confronted with complex data from general metric spaces, such as symmetric positive definite matrix-valued data and probability distribution functions. [47] and [17] establish a general paradigm of Fr\'echet regression with complex metric space valued responses and Euclidean predictors. However, their proposed local Fr\'echet regression approach involves nonparametric kernel smoothing and suffers from the curse of dimensionality. To address this issue, we in this paper propose a novel random forests weighted local Fr\'echet regression paradigm. The main mechanism of our approach relies on the adaptive kernels generated by random forests. Our first method utilizes these weights as the local average to solve the Fr\'echet mean, while the second method performs local linear Fr\'echet regression, making both methods locally adaptive. Our proposals significantly improve existing Fr\'echet regression methods. Based on the theory of infinite order U-processes and infinite order Mmn-estimator, we establish the consistency, rate of convergence, and asymptotic normality for our proposed random forests weighted Fr\'echet regression estimator, which covers the current large sample theory of random forests with Euclidean responses as a special case. Numerical studies show the superiority of our proposed two methods for Fr\'echet regression with several commonly encountered types of responses such as probability distribution functions, symmetric positive definite matrices, and sphere data. The practical merits of our proposals are also demonstrated through the application to the human mortality distribution data.


Approximate Fr\'echet Mean for Data Sets of Sparse Graphs

arXiv.org Machine Learning

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Fr\'echet mean. In this work, we equip a set of graph with the pseudometric defined by the $\ell_2$ norm between the eigenvalues of their respective adjacency matrix . Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems on sets of graphs. We describe an algorithm to compute an approximation to the Fr\'echet mean of a set of undirected unweighted graphs with a fixed size.


A selective review of sufficient dimension reduction for multivariate response regression

arXiv.org Machine Learning

We review sufficient dimension reduction (SDR) estimators with multivariate response in this paper. A wide range of SDR methods are characterized as inverse regression SDR estimators or forward regression SDR estimators. The inverse regression family include pooled marginal estimators, projective resampling estimators, and distance-based estimators. Ordinary least squares, partial least squares, and semiparametric SDR estimators, on the other hand, are discussed as estimators from the forward regression family.


Statistical monitoring of functional data using the notion of Fr\'echet mean combined with the framework of the deformation models

arXiv.org Machine Learning

The aim of this paper is to investigate possible advances obtained by the implementation of the framework of Fr\'echet mean and the generalized sense of mean that it offers, in the field of statistical process monitoring and control. In particular, the case of non-linear profiles which are described by data in functional form is considered and a framework combining the notion of Fr\'echet mean and deformation models is developed. The proposed monitoring approach is implemented to the intra-day air pollution monitoring task in the city of Athens where the capabilities and advantages of the method are illustrated.


Theoretical analysis and computation of the sample Frechet mean for sets of large graphs based on spectral information

arXiv.org Machine Learning

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Frechet mean. In this work, we equip a set of graphs with the pseudometric defined by the norm between the eigenvalues of their respective adjacency matrix. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Frechet mean of a set of undirected unweighted graphs with a fixed size using this pseudometric.