A Boo(n) for Evaluating Architecture Performance

Bajgar, Ondrej, Kadlec, Rudolf, Kleindienst, Jan

Jul-5-2018 – arXiv.org Machine Learning

We point out important problems with the common practice of using the best single model performance for comparing deep learning architectures, and we propose a method that corrects these flaws. Each time a model is trained, one gets a different result due to random factors in the training process, which include random parameter initialization and random data shuffling. Reporting the best single model performance does not appropriately address this stochasticity. We propose a normalized expected best-out-of-$n$ performance ($\text{Boo}_n$) as a way to correct these problems.
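To make the idea of an expected best-out-of-$n$ performance concrete, here is a minimal sketch that estimates it from a set of repeated training runs. It rests on assumptions not stated in the abstract: the estimator is the expected maximum of $n$ independent draws (with replacement) from the empirical distribution of observed scores, higher scores are taken to be better, and the normalization step mentioned above is left out because the abstract does not define it. The function name `expected_best_of_n` and the example scores are hypothetical.

```python
def expected_best_of_n(scores, n):
    """Estimate the expected best score among n training runs.

    Assumption: runs are i.i.d. and the empirical distribution of
    `scores` stands in for the true performance distribution.
    Higher scores are assumed to be better.
    """
    xs = sorted(float(s) for s in scores)
    m = len(xs)
    if m == 0:
        raise ValueError("scores must not be empty")
    if n < 1:
        raise ValueError("n must be a positive integer")

    total = 0.0
    for i, x in enumerate(xs, start=1):
        # Probability that the maximum of n draws equals the i-th
        # smallest observed score: (i/m)^n - ((i-1)/m)^n.
        weight = (i / m) ** n - ((i - 1) / m) ** n
        total += weight * x
    return total


if __name__ == "__main__":
    # Hypothetical validation accuracies from ten runs of one architecture.
    runs = [0.712, 0.705, 0.731, 0.698, 0.720, 0.709, 0.744, 0.701, 0.716, 0.727]
    for n in (1, 5, 10):
        print(n, expected_best_of_n(runs, n))
```

With `n = 1` the estimate reduces to the sample mean, and as `n` grows it approaches the best observed score. Unlike the maximum of whichever runs happened to be carried out, it is reported for a stated `n`, so two architectures can be compared at the same value of `n`.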

artificial intelligence, machine learning, performance distribution

Country:
- North America > United States
  - Massachusetts > Middlesex County > Cambridge (0.04)
- Europe
  - Czechia > Prague (0.04)
  - Sweden > Stockholm
    - Stockholm (0.04)

Genre:
- Research Report
  - New Finding (0.68)
  - Experimental Study (0.47)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
