Decoding machine learning benchmarks