Reviews: Variable Importance Using Decision Trees
Neural Information Processing Systems
The article tackles the problem of variable importance in regression trees. The strategy is to select variables according to the impurity reduction they induce on the response Y. The main feature of this strategy is that the impurity reduction is computed from the ordering of Y induced by the ranking of the X variable under consideration, so it measures the relationship between Y and each variable more robustly than simple correlation would. The authors prove that the strategy is consistent (i.e. the true explanatory variables are selected) in a range of settings. This is then illustrated on a simulated example, where the results are largely what one would expect: the proposed procedure can account for monotone but nonlinear relationships between X and Y, so it yields better results than simple correlations.
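To make the contrast with correlation concrete, here is a minimal sketch. It is not the authors' procedure: it assumes a plain single-split variance reduction (a decision stump) as a stand-in for the impurity measure, and the function names and the toy data are hypothetical. It only shows why a score that depends on X through its ranking can capture a monotone, nonlinear relationship that squared Pearson correlation understates.

```python
import random


def stump_variance_reduction(x, y):
    """Fraction of the variance of y removed by the best single split on x.

    Only the ordering of x matters: y is sorted by x and every cut point
    in that ordering is tried. Ties in x are not handled specially.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    total = sum(y_sorted)
    mean = total / n
    total_ss = sum((v - mean) ** 2 for v in y_sorted)
    if total_ss == 0:
        return 0.0
    best_gain = 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += y_sorted[k - 1]
        right_sum = total - left_sum
        # Between-group sum of squares, i.e. the drop in squared error
        # obtained by splitting after the k-th smallest value of x.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        best_gain = max(best_gain, gain)
    return best_gain / total_ss


def squared_correlation(x, y):
    """Squared Pearson correlation between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Toy data (an assumption for illustration): y is a monotone step in
# x_signal, while x_noise is unrelated to y.
n = 200
x_signal = [i / n for i in range(n)]
y = [1.0 if v > 0.9 else 0.0 for v in x_signal]
rng = random.Random(0)
x_noise = [rng.random() for _ in range(n)]

for name, x in [("x_signal", x_signal), ("x_noise", x_noise)]:
    print(name,
          "stump:", round(stump_variance_reduction(x, y), 3),
          "r^2:", round(squared_correlation(x, y), 3))
```

Because the stump score uses X only through its ordering, applying any strictly increasing transformation to `x_signal` leaves the score unchanged, whereas the squared correlation would generally move.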
Oct-7-2024, 22:01:14 GMT