Diversely Stale Parameters for Efficient Training of CNNs