Non-Determinism in TensorFlow ResNets
Morin, Miguel; Willetts, Matthew
Researchers commonly need to run deep learning models repeatedly to understand the variation in performance. The reruns are typically done using new seeds for the creation of minibatches and for initialisation. We show here that for ResNets in TensorFlow Keras running on GPUs, the variation caused by these sources of noise is dominated by that coming from the intrinsic non-determinism of GPUs themselves. While the existence of GPU non-determinism is well known, the scale of its effect is perhaps less well understood, especially in the context of contemporary machine learning algorithms. First we explain the source of this GPU-induced variability, across packages and operating systems. Then we study the effects of GPU non-determinism on standard ResNet architectures. To isolate the effect of GPU non-determinism, we held constant the other sources of randomness that affect the training of a ResNet: initial parameter weights and the ordering of training minibatches.
Jan-30-2020
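The controlled setup described in the abstract, in which initial weights and minibatch ordering are held constant across reruns, can be illustrated with a minimal framework-agnostic sketch. This is not the authors' code: the function names, seed values, and sizes below are hypothetical, and only the Python standard library is used so that the example runs without a GPU or TensorFlow.

```python
import random


def initial_weights(seed, n_weights):
    """Draw hypothetical initial parameters from a dedicated, seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.05) for _ in range(n_weights)]


def minibatch_order(seed, n_examples, batch_size, n_epochs):
    """Return the example indices of every minibatch, epoch by epoch."""
    rng = random.Random(seed)
    batches = []
    for _ in range(n_epochs):
        indices = list(range(n_examples))
        rng.shuffle(indices)  # reshuffle each epoch, driven by the same seeded generator
        for start in range(0, n_examples, batch_size):
            batches.append(indices[start:start + batch_size])
    return batches


# Arbitrary illustrative seeds (assumptions, not values from the paper).
INIT_SEED, DATA_SEED = 0, 1

# Two "reruns" that reuse the same seeds for both controlled sources of randomness.
run_a = (initial_weights(INIT_SEED, 8), minibatch_order(DATA_SEED, 10, 4, 2))
run_b = (initial_weights(INIT_SEED, 8), minibatch_order(DATA_SEED, 10, 4, 2))

# Both reruns start from identical weights and see identical minibatches.
same_inputs = run_a == run_b
```

With both sources fixed in this way, any remaining run-to-run difference in a real GPU training job cannot be attributed to initialisation or data ordering. In TensorFlow the analogous step would presumably involve seeding the framework's own generators (for example with `tf.random.set_seed`) as well as the data pipeline's shuffling.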