The Simple Math behind 3 Decision Tree Splitting criterions
Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it was randomly labeled according to the distribution of labels in the subset. In simple terms, Gini impurity is the measure of impurity in a node. So to understand the formula a little better, let us talk specifically about the binary case where we have nodes with only two classes. So in the below five examples of candidate nodes labelled A-E and with the distribution of positive and negative class shown, which is the ideal condition to be in? I reckon you would say A or E and you are right.
Oct-6-2019, 02:44:12 GMT
- Technology: