Scaling Up Data Parallelism in Decentralized Deep Learning