Next-token pretraining implies in-context learning
