Pushing the limits of GPU performance with XLA – TensorFlow – Medium
XLA is a compiler for TensorFlow graphs that you can use today to accelerate your TensorFlow ML models with minimal source code changes. This post describes what XLA is and shows how to try it out on your own code. On ResNet50 v1.0 training on NVIDIA Tesla V100 GPUs, TensorFlow 1.12 with XLA achieves significant performance gains over TensorFlow 1.11 without XLA: 10,526 images/sec with synthetic data and 10,267 images/sec with real data (see the appendix for reproduction instructions). We have also observed speedups ranging from 1.13x to 3.04x on a variety of internal models.
Normally, when you run a TensorFlow graph, each operation is executed individually by the TensorFlow graph executor.
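To make the contrast concrete, here is a toy pure-Python sketch of op-by-op execution versus running the same chain of operations as one combined pass, which is the kind of transformation a graph compiler such as XLA can apply. It is not TensorFlow or XLA code: the function names (`run_op_by_op`, `compile_fused`, `run_fused`) and the "pass" counter are hypothetical and exist only for illustration.

```python
# Toy model only: this is NOT TensorFlow or XLA code.
# All names here are hypothetical and chosen for illustration.

def run_op_by_op(ops, values):
    """Apply each op as its own pass, building one intermediate list per op."""
    passes = 0
    for op in ops:
        values = [op(v) for v in values]  # one full pass over the data per op
        passes += 1
    return values, passes


def compile_fused(ops):
    """Combine a chain of element-wise ops into a single per-element function."""
    def fused(v):
        for op in ops:
            v = op(v)
        return v
    return fused


def run_fused(ops, values):
    """Run the combined function in a single pass, with no intermediate lists."""
    fused = compile_fused(ops)
    return [fused(v) for v in values], 1


# A small chain of element-wise ops: scale, shift, then clamp at zero.
ops = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: max(v, 0.0)]
data = [-3.0, -0.5, 0.0, 2.0]

unfused_result, unfused_passes = run_op_by_op(ops, data)
fused_result, fused_passes = run_fused(ops, data)

print(unfused_result, unfused_passes)
print(fused_result, fused_passes)
```

Both paths compute the same values; the difference is that the op-by-op path walks the data once per operation and stores every intermediate result, while the combined path walks it once. On a real accelerator the analogous savings come from fewer kernel launches and less memory traffic, although the actual gain depends on the model.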
Nov-16-2018, 23:49:51 GMT