Improved Convergence in High Probability of Clipped Gradient Methods with Heavy Tailed Noise

Neural Information Processing Systems 

In this work, we consider the setting of heavy-tailed noise proposed by Zhang et al. (2020) [32],