CCMatrix: A billion-scale bitext data set for training translation models
