Transformers are almost optimal metalearners for linear classification
