Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training
