A New Massive Multilingual Dataset for High-Performance Language Technologies