The life of a dataset in machine learning research – interview with Bernard Koch
Bernard Koch, Emily Denton, Alex Hanna and Jacob Foster won a best paper award in the datasets and benchmarks track at NeurIPS 2021 for "Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research". Here, Bernard tells us about the advantages and disadvantages of benchmarking, the findings of their paper, and plans for future work.

Machine learning is a rather unusual science, partly because it straddles the space between science and engineering. The main way that progress is evaluated is through state-of-the-art benchmarking. The scientific community agrees on a shared problem and picks a dataset that it thinks is representative of the data you might see when trying to solve that problem in the real world. Researchers then compare their algorithms by the score each achieves on that dataset.
Feb-17-2022, 15:04:23 GMT