Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging