Understanding LazyTensor System Performance with PyTorch/XLA on Cloud TPU