Implicit Bias and Fast Convergence Rates for Self-attention
