Piper: Multidimensional Planner for DNN Parallelization

Neural Information Processing Systems

In the "modern era", such model-parallel training techniques trace their roots back to AlexNet [