A Too-Clever Ranking Method

AI Magazine 

I developed what I thought was an extremely clever method for detecting "bad" training instances. Each instance was scored, and those with the lowest scores could be removed before running C4.5 to build a decision tree with the remainder. I ran an experiment in which I removed the bottom 10 percent of the instances in a University of California, Irvine (UCI) data set. The resulting tree was smaller and more accurate (as measured by 10-fold cross-validation) than the tree built on the full data set. Then I removed the bottom 20 percent of the instances and got a tree that was smaller than the last one and just as accurate.
