How to train a new language model from scratch using Transformers and Tokenizers
Over the past few weeks, we made several improvements to our transformers and tokenizers libraries, with the goal of making it much easier to train a new language model from scratch.

In this post we'll demo how to train a "small" model (84M parameters: 6 layers, 768 hidden size, 12 attention heads) – that's the same number of layers & heads as DistilBERT – on Esperanto. Esperanto is a constructed language designed to be easy to learn. You won't need to understand Esperanto to follow this post, but if you do want to learn it, Duolingo has a nice course with 280k active learners.

First, let us find a corpus of text in Esperanto.
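As a sanity check on the "84M parameters" figure, here is a back-of-the-envelope parameter count for a BERT/RoBERTa-style encoder with 6 layers, hidden size 768, and 12 attention heads. The vocabulary size (52,000), maximum positions (514), and feed-forward size (4 × hidden) are assumptions for this sketch, not values stated above.

```python
def param_count(vocab=52_000, max_pos=514, hidden=768, layers=6, ffn=None):
    """Approximate parameter count of a BERT/RoBERTa-style MLM.
    Assumed: vocab size 52k, 514 positions, FFN = 4 * hidden, tied embeddings."""
    ffn = ffn or 4 * hidden
    # Embeddings: token + position + token-type tables, plus LayerNorm (weight + bias)
    embeddings = vocab * hidden + max_pos * hidden + 1 * hidden + 2 * hidden
    # One encoder layer: Q/K/V/output projections (weights + biases),
    # FFN up/down projections, and two LayerNorms
    attention = 4 * (hidden * hidden + hidden)
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer = attention + feed_forward + 2 * (2 * hidden)
    # MLM head: dense + LayerNorm + decoder bias (decoder weight tied to embeddings)
    lm_head = (hidden * hidden + hidden) + 2 * hidden + vocab
    return embeddings + layers * layer + lm_head

total = param_count()
print(f"{total / 1e6:.1f}M parameters")  # prints "83.5M parameters"
```

Most of the budget sits in the token embedding table (~40M with a 52k vocabulary), which is why vocabulary size matters so much for small models.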
Mar-5-2020, 00:43:23 GMT