Benchmarking Training Time for CNN-based Detectors with Apache MXNet | Amazon Web Services
The expected reduction in training time as the batch size increases is clearly visible. But why does training time increase so drastically when we increase the batch size in Caffe? Let's have a look at the nvidia-smi output for both of the Caffe experiments:
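If you want to collect the same kind of readings programmatically while a training job runs, a minimal sketch might look like the following. It assumes `nvidia-smi` is on the PATH and uses its CSV query mode. The helper names `parse_gpu_csv` and `query_gpus` and the choice of fields are ours for illustration and are not part of the original benchmark.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi; this selection is an assumption made for
# illustration (GPU index, utilization in %, memory used/total in MiB).
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_csv(text: str) -> List[Dict[str, int]]:
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip lines that do not match the requested fields
        try:
            rows.append({k: int(v) for k, v in zip(QUERY_FIELDS, values)})
        except ValueError:
            continue  # skip GPUs that report a non-numeric value such as [N/A]
    return rows


def query_gpus() -> List[Dict[str, int]]:
    """Run nvidia-smi once and return the parsed readings (hypothetical helper)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling `query_gpus()` at intervals during each experiment, once per batch size, gives a record of utilization and memory use that can be compared across runs. Low utilization or memory close to the total are the kinds of signals worth checking when a larger batch size unexpectedly slows training.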
Aug-30-2017, 05:40:17 GMT