Regress, Don't Guess -- A Regression-like Loss on Number Tokens for Language Models
