[R] [1707.02968] Revisiting Unreasonable Effectiveness of Data in Deep Learning Era • r/MachineLearning
The improvement chart looks nice, and they note the slope is probably steeper than it looks because they didn't train the models to convergence or run a hyperparameter search. On the other hand, that in a way answers their question. This paper used 50 K80 GPUs for 2 months and they still couldn't train a 101-layer ResNet to convergence, much less do a hyperparameter search or experiment with 1000-layer ResNets, DenseNets, attention, or all the other fun things you can do with cutting-edge CNNs. If a Google/CMU team with that many computational resources can't make good use of 300M images, why does anyone anywhere need that dataset?
Jul-11-2017, 03:55:12 GMT