A Nonparametric Test of Dependence Based on Ensemble of Decision Trees

Jul-23-2020–arXiv.org Machine Learning

A general-purpose method for detecting statistical dependence, or correlation, between random variables is valuable across a wide range of sciences and applications (Li, 2000; Martínez-Gómez et al., 2014; Mahdi et al., 2012). Linear correlation (Pearson, 1920) is one of the oldest statistical methods still in wide use today. Although the assumption of linearity is not always realistic, the method remains popular because it is easy to compute, simple, interpretable, and has high power when the linearity assumption holds. Several approaches have been proposed to quantify correlation in the general case, for more complex relationships and under less stringent assumptions. Examples include kernel-based correlation (Hardoon et al., 2004; Chang et al., 2013), copula methods (Poczos et al., 2012), distance correlation (Székely et al., 2007; Székely and Rizzo, 2009), and discretization-based mutual information (MI) methods (Steuer et al., 2002) such as the maximal information coefficient (MIC) (Reshef et al., 2011). Shortcomings found in some of the existing methods include low statistical power, high computational demand, lack of intuitive interpretability, and the lack of a known distribution of the coefficient under independence, which would be needed to compute a statistical confidence. A more thorough discussion of the pros and cons of these and other methods can be found in several studies (de Siqueira Santos et al., 2014; N. Reshef et al., 2018).
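As a small illustration of why the linearity assumption matters, the following sketch computes the Pearson coefficient on two synthetic relationships: one linear and one quadratic. The data, the helper name `pearson_r`, and the grid of points are assumptions made for this example and do not come from the paper. In both cases the second variable is fully determined by the first, yet the linear coefficient only reflects the dependence in the linear case.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Symmetric grid on [-1, 1]; illustrative data, not taken from the paper.
xs = [i / 50 for i in range(-50, 51)]
linear = [2 * x + 1 for x in xs]   # y is a linear function of x
quadratic = [x ** 2 for x in xs]   # y is determined by x, but not linearly

r_linear = pearson_r(xs, linear)
r_quadratic = pearson_r(xs, quadratic)

print(f"linear relationship:    r = {r_linear:.3f}")
print(f"quadratic relationship: r = {r_quadratic:.3f}")
```

On this symmetric grid the coefficient for the quadratic case is essentially zero even though the dependence is deterministic, which is the kind of relationship the more general dependence measures listed above are designed to detect.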

artificial intelligence, correlation, machine learning

