Rethinking benchmark systems for machine learning
Common methods for evaluating model performance share several limitations. There are many ways to check whether a new algorithm improves on previous state-of-the-art algorithms, and most of them are statistical testing procedures. In his paper "Statistical Comparisons of Classifiers over Multiple Data Sets", Janez Demšar reviewed commonly used practices and identified numerous problems with them. He analyzed papers from five International Conferences on Machine Learning (1999-2003) that compared at least two classification models.
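To make the idea of a testing procedure concrete, here is a minimal sketch of the Wilcoxon signed-rank statistic, one of the non-parametric tests discussed in Demšar's paper for comparing two classifiers over several data sets. The function name and the accuracy values are hypothetical and serve only as an illustration. Halving the ranks of zero differences is a simplifying assumption, and the sketch computes only the statistic, not a p-value or critical value.

```python
from typing import Sequence


def wilcoxon_signed_rank_statistic(
    scores_a: Sequence[float], scores_b: Sequence[float]
) -> float:
    """Return T = min(R+, R-) for paired per-data-set scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")

    # Per-data-set differences, rounded to suppress floating-point noise.
    diffs = [round(b - a, 10) for a, b in zip(scores_a, scores_b)]

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

    # Simplification: split the ranks of zero differences evenly.
    zero_ranks = sum(r for d, r in zip(diffs, ranks) if d == 0)
    r_plus += zero_ranks / 2
    r_minus += zero_ranks / 2

    return min(r_plus, r_minus)


# Hypothetical accuracies of two classifiers on six data sets.
baseline = [0.80, 0.75, 0.90, 0.60, 0.85, 0.70]
candidate = [0.82, 0.74, 0.95, 0.66, 0.85, 0.73]
statistic = wilcoxon_signed_rank_statistic(baseline, candidate)
```

A small value of the statistic means the differences point mostly in one direction. Whether it is small enough to reject the null hypothesis has to be checked against a table of critical values for the given number of data sets.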
Sep-3-2020, 08:56:51 GMT