Provable Long-Range Benefits of Next-Token Prediction
