Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness
