Understanding Gradient Orthogonalization for Deep Learning via Non-Euclidean Trust-Region Optimization

Open in new window