Understanding Emergent Abilities of Language Models from the Loss Perspective

Neural Information Processing Systems 

In this paper, we propose to study emergent abilities through the lens of pre-training loss, rather than model size or training compute.
