Implicit Optimization Bias of Next-token Prediction in Linear Models
