On the Training Convergence of Transformers for In-Context Classification
