Distributed Training with Heterogeneous Data: Bridging Median- and Mean-Based Algorithms