Meet in the Middle: A New Pre-training Paradigm

Neural Information Processing Systems 

Most language models (LMs) are trained and applied in an autoregressive, left-to-right fashion, predicting the next token from the preceding ones. However, this ignores the fact that the full sequence, including the tokens to the right of each position, is available during training.
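To make the left-to-right setup concrete, here is a minimal sketch (not from the paper) of the autoregressive factorization log p(x) = Σ_t log p(x_t | x_&lt;t), using a toy bigram model with made-up probabilities. Each token's likelihood is conditioned only on its left context; tokens to the right are never consulted.

```python
import math

# Hypothetical bigram probabilities for illustration only; a real LM
# would condition on the entire left context, not just one token.
bigram_prob = {
    ("<s>", "the"): 0.9,
    ("the", "cat"): 0.5,
    ("cat", "sat"): 0.4,
}

def left_to_right_log_likelihood(tokens):
    """Autoregressive factorization: log p(x) = sum over t of
    log p(x_t | x_{t-1}). Only the left context is used, mirroring
    standard next-token-prediction training."""
    ll = 0.0
    prev = "<s>"  # sentence-start symbol
    for tok in tokens:
        ll += math.log(bigram_prob[(prev, tok)])
        prev = tok
    return ll

ll = left_to_right_log_likelihood(["the", "cat", "sat"])
```

The point of the sketch is that nothing in the loop ever looks at a token to the right of the current position, which is exactly the information a pre-training objective could otherwise exploit.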
