DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining