Runtime Concurrency Control and Operation Scheduling for High Performance Neural Network Training