Feature Importance: A Closer Look at Shapley Values and LOCO
Verdinelli, Isabella, Wasserman, Larry
There is much interest lately in explainability in statistics and machine learning. One aspect of explainability is to quantify the importance of various features (or covariates). Two popular methods for defining variable importance are LOCO (Leave Out COvariates) and Shapley Values. We examine the properties of these methods and their advantages and disadvantages. We are particularly interested in the effect of correlation between features, which can obscure interpretability. Contrary to some claims, Shapley values do not eliminate feature correlation. We critique the game-theoretic axioms for Shapley values and propose new, more statistically oriented axioms for feature importance, along with some measures that satisfy them. However, correcting for correlation is a Faustian bargain: removing the effect of correlation creates other forms of bias. Ultimately, we recommend a slightly modified version of LOCO. We briefly consider how to modify Shapley values to better address feature correlation.
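To make the contrast concrete, here is a minimal sketch (the simulated design, the helper name subset_mse, and the use of in-sample MSE as the value function are our illustrative choices, not the paper's): two covariates share a latent factor, but only one enters the true regression. LOCO gives the redundant covariate near-zero importance, while its Shapley value stays positive, because Shapley averages over submodels in which the redundant covariate proxies for its correlated partner.

```python
# Minimal sketch: LOCO vs. Shapley importance under feature correlation.
# Design and helper names are illustrative, not from the paper.
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 3
z = rng.normal(size=n)                      # latent factor shared by X0, X1
X = np.column_stack([z + 0.3 * rng.normal(size=n),
                     z + 0.3 * rng.normal(size=n),
                     rng.normal(size=n)])
y = 2.0 * X[:, 0] + 1.0 * X[:, 2] + rng.normal(size=n)  # X1 is redundant

def subset_mse(S):
    """In-sample MSE of a least-squares fit using covariate subset S."""
    if not S:
        return float(np.mean((y - y.mean()) ** 2))
    A = np.column_stack([np.ones(n), X[:, sorted(S)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ beta) ** 2))

full = frozenset(range(d))
for j in range(d):
    # LOCO: excess prediction error from dropping covariate j alone.
    loco = subset_mse(full - {j}) - subset_mse(full)
    # Shapley: weighted average marginal MSE reduction over all submodels.
    shap = 0.0
    for r in range(d):
        for S in itertools.combinations(sorted(full - {j}), r):
            w = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            shap += w * (subset_mse(set(S)) - subset_mse(set(S) | {j}))
    print(f"X{j}: LOCO = {loco:.3f}   Shapley = {shap:.3f}")
```

This is one way to see the point above: correlation does not disappear under Shapley values; it redistributes the credit.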
Decorrelated Variable Importance
Verdinelli, Isabella, Wasserman, Larry
Because of the widespread use of black box prediction methods such as random forests and neural nets, there is renewed interest in developing methods for quantifying variable importance as part of the broader goal of interpretable prediction. A popular approach is to define a variable importance parameter, known as LOCO (Leave Out COvariates), based on dropping covariates from a regression model. This is essentially a nonparametric version of R-squared. This parameter is very general and can be estimated nonparametrically, but it can be hard to interpret because it is affected by correlation between covariates. We propose a method for mitigating the effect of correlation by defining a modified version of LOCO. This new parameter is difficult to estimate nonparametrically, but we show how to estimate it using semiparametric models.
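In the spirit of the abstract, the (unmodified) LOCO parameter can be written as a difference of prediction errors; the symbols psi_j and mu below are our illustrative notation:

```latex
% A sketch of the LOCO parameter described above; notation is ours.
\[
  \mu(x) = \mathbb{E}[Y \mid X = x], \qquad
  \mu_{-j}(x_{-j}) = \mathbb{E}[Y \mid X_{-j} = x_{-j}],
\]
\[
  \psi_j \;=\; \mathbb{E}\!\left[(Y - \mu_{-j}(X_{-j}))^2\right]
        \;-\; \mathbb{E}\!\left[(Y - \mu(X))^2\right].
\]
% Dividing psi_j by Var(Y) gives a nonparametric analogue of the drop in
% R-squared from leaving out covariate j. This also shows why psi_j is hard
% to interpret under correlation: mu_{-j} can recover much of X_j's signal
% from the remaining, correlated covariates, shrinking psi_j toward zero.
```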
Forest Guided Smoothing
Verdinelli, Isabella, Wasserman, Larry
Random forests are often an accurate method for nonparametric regression, but they are notoriously difficult to interpret, and it is difficult to construct standard errors, confidence intervals, and meaningful measures of variable importance. In this paper, we construct a spatially adaptive local linear smoother that approximates the forest. Our approach builds on the ideas in Bloniarz et al. (2016) and Friedberg et al. (2020). The main difference is that we define a one-parameter family of bandwidth matrices, which helps with the construction of confidence intervals and measures of variable importance. Our starting point is the well-known fact that a random forest can be regarded as a type of kernel smoother (Breiman (2000); Scornet (2016); Lin and Jeon (2006); Geurts et al. (2006); Hothorn et al. (2004); Meinshausen (2006)). We take it as given that the forest is an accurate predictor and make no attempt to improve the method. Instead, we want to find a family of linear smoothers that approximates the forest. We then show how to use this family for interpretation, bias correction, confidence intervals, variable importance, and exploring the structure of the forest.
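The "forest as kernel smoother" observation can be made concrete in a few lines. The sketch below is our own construction (forest_kernel_weights and local_linear_fit are hypothetical helpers, and it uses a plain scalar forest kernel rather than the paper's bandwidth-matrix family): it weights a local linear fit by how often each training point shares a leaf with the query point, and the resulting local slopes are the kind of interpretable, derivative-like quantity that supports variable importance.

```python
# Hedged sketch: a random forest viewed as a kernel smoother, then used as
# weights in a local linear fit. Illustrative only; not the paper's method.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 500
X = rng.uniform(-2, 2, size=(n, 2))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.2 * rng.normal(size=n)

forest = RandomForestRegressor(n_estimators=200, min_samples_leaf=5,
                               random_state=0).fit(X, y)

def forest_kernel_weights(x0):
    """w_i = fraction of trees in which training point x_i shares a leaf with x0."""
    leaves_train = forest.apply(X)           # (n, n_trees) leaf indices
    leaves_x0 = forest.apply(x0[None, :])    # (1, n_trees)
    same = (leaves_train == leaves_x0).mean(axis=1)
    return same / same.sum()

def local_linear_fit(x0):
    """Weighted least squares of y on (1, X - x0); the intercept is the fit at x0."""
    w = forest_kernel_weights(x0)
    A = np.column_stack([np.ones(n), X - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
    return beta[0], beta[1:]                 # fitted value and local slopes

x0 = np.array([0.5, -0.5])
fit, slopes = local_linear_fit(x0)
print(f"forest prediction at x0: {forest.predict(x0[None, :])[0]:.3f}")
print(f"smoother fit at x0: {fit:.3f}, local slopes: {slopes}")
```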
Finding Singular Features
Genovese, Christopher, Perone-Pacifico, Marco, Verdinelli, Isabella, Wasserman, Larry
Nonparametric ridge estimation
Genovese, Christopher R., Perone-Pacifico, Marco, Verdinelli, Isabella, Wasserman, Larry
We study the problem of estimating the ridges of a density function. Ridge estimation is an extension of mode finding and is useful for understanding the structure of a density. It can also be used to find hidden structure in point cloud data. We show that, under mild regularity conditions, the ridges of the kernel density estimator consistently estimate the ridges of the true density. When the data are noisy measurements of a manifold, we show that the ridges are close and topologically similar to the hidden manifold. To find the estimated ridges in practice, we adapt the modified mean-shift algorithm proposed by Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical experiments verify that the algorithm is accurate.
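The Ozertem and Erdogmus procedure is a subspace constrained mean shift: each point takes a mean-shift step projected onto the direction(s) in which the estimated density curves downward most sharply, so points climb onto the ridge without sliding along it. Below is a hedged toy version for a one-dimensional ridge in R^2; the noisy-circle data, bandwidth, and iteration count are our illustrative choices, not the paper's settings.

```python
# Toy subspace-constrained mean shift (SCMS) for a 1-d ridge in R^2,
# in the spirit of Ozertem and Erdogmus (2011). Illustrative sketch only.
import numpy as np

rng = np.random.default_rng(2)
# Noisy circle: a hidden 1-d manifold observed with measurement noise.
theta = rng.uniform(0, 2 * np.pi, size=400)
data = (np.column_stack([np.cos(theta), np.sin(theta)])
        + 0.1 * rng.normal(size=(400, 2)))
h = 0.3  # Gaussian kernel bandwidth (an arbitrary illustrative choice)

def scms_step(x):
    diff = data - x                                   # (n, 2)
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2)
    shift = (w[:, None] * data).sum(0) / w.sum() - x  # mean-shift vector
    # Hessian of the Gaussian KDE at x, up to a positive constant.
    H = (w[:, None, None]
         * (diff[:, :, None] * diff[:, None, :] / h ** 2
            - np.eye(2))).sum(0)
    vals, vecs = np.linalg.eigh(H)                    # ascending eigenvalues
    V = vecs[:, [0]]                                  # most negative curvature
    return x + (V @ V.T @ shift)                      # move only across the ridge

# Push each point toward the density ridge.
ridge = data.copy()
for _ in range(50):
    ridge = np.array([scms_step(x) for x in ridge])
print("mean distance to unit circle:",
      np.mean(np.abs(np.linalg.norm(ridge, axis=1) - 1)))
```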
Minimax Manifold Estimation
Genovese, Christopher, Perone-Pacifico, Marco, Verdinelli, Isabella, Wasserman, Larry
We find the minimax rate of convergence in Hausdorff distance for estimating a manifold M of dimension d embedded in R^D given a noisy sample from the manifold. We assume that the manifold satisfies a smoothness condition and that the noise distribution has compact support. We show that the optimal rate of convergence is n^{-2/(2+d)}. Thus, the minimax rate depends only on the dimension of the manifold, not on the dimension of the space in which M is embedded.
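Schematically, in our notation (consistent with the abstract), the result says:

```latex
% Schematic statement of the minimax rate; the symbols are ours.
\[
  \inf_{\hat{M}} \; \sup_{P \in \mathcal{P}}
  \mathbb{E}_P \bigl[ H(\hat{M}, M) \bigr] \;\asymp\; n^{-2/(2+d)},
\]
% where H is the Hausdorff distance, \hat{M} ranges over estimators based
% on an i.i.d. sample of size n, and \mathcal{P} is the class of
% distributions satisfying the smoothness and compact-noise conditions.
% Note that the exponent involves only d, the dimension of the manifold,
% not the ambient dimension D.
```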