The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
