Grönroos, Stig-Arne
MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki
Mickus, Timothee, Grönroos, Stig-Arne, Attieh, Joseph, Boggia, Michele, De Gibert, Ona, Ji, Shaoxiong, Lopi, Niki Andreas, Raganato, Alessandro, Vázquez, Raúl, Tiedemann, Jörg
NLP in the age of monolithic large language models is approaching its limits in terms of model size and the amount of information that can be handled. The trend is toward modularization, a necessary step toward designing smaller sub-networks and components with specialized functionality. In this paper, we present the MAMMOTH toolkit: a framework designed for training massively multilingual modular machine translation systems at scale, initially derived from OpenNMT-py and then adapted to ensure efficient training across computation clusters. We showcase its efficiency across clusters of A100 and V100 NVIDIA GPUs, and discuss our design philosophy and plans for future work. The toolkit is publicly available online.
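To make the modular design concrete, here is a minimal PyTorch sketch of the core idea, not MAMMOTH's actual API; the class and parameter names are illustrative. Each language gets its own encoder and decoder module, and a translation direction activates only the relevant pair.

```python
import torch
import torch.nn as nn

class ModularTranslator(nn.Module):
    """Toy per-language modularity: one encoder and one decoder per language."""
    def __init__(self, langs, d_model=512, nhead=8, num_layers=2):
        super().__init__()
        self.encoders = nn.ModuleDict({
            lang: nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
                num_layers)
            for lang in langs})
        self.decoders = nn.ModuleDict({
            lang: nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead, batch_first=True),
                num_layers)
            for lang in langs})

    def forward(self, src_emb, tgt_emb, src_lang, tgt_lang):
        memory = self.encoders[src_lang](src_emb)        # only src_lang's module runs
        return self.decoders[tgt_lang](tgt_emb, memory)  # only tgt_lang's module runs

model = ModularTranslator(["en", "fi", "sv"])
src = torch.randn(2, 7, 512)  # (batch, source length, d_model) dummy embeddings
tgt = torch.randn(2, 5, 512)
print(model(src, tgt, "en", "fi").shape)  # torch.Size([2, 5, 512])
```

This kind of routing is what makes the modular approach attractive for clusters: modules for different languages can be placed on different devices, and a batch for one language pair only touches the parameters of its own modules.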
Isotropy, Clusters, and Classifiers
Mickus, Timothee, Grönroos, Stig-Arne, Attieh, Joseph
Whether embedding spaces use all their dimensions equally, i.e., whether they are isotropic, has been a recent subject of discussion. Evidence has accrued both for and against enforcing isotropy in embedding spaces. In the present paper, we stress that isotropy imposes requirements on the embedding space that are incompatible with the presence of clusters, which in turn also negatively impacts linear classification objectives. We demonstrate this fact empirically and use it to shed light on previous results from the literature.
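The incompatibility is easy to reproduce numerically. The sketch below is a minimal illustration, not the paper's experimental setup; the isotropy measure used here is the simple covariance eigenvalue ratio. It compares a single spherical Gaussian against a mixture of well-separated Gaussians: cluster structure concentrates variance along the directions between cluster centers, driving the score toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 64

# Isotropic embeddings: one spherical Gaussian uses all directions equally.
isotropic = rng.standard_normal((n, d))

# Clustered embeddings: a mixture of four well separated spherical Gaussians.
centers = 10.0 * rng.standard_normal((4, d))
clustered = centers[rng.integers(0, 4, size=n)] + rng.standard_normal((n, d))

def isotropy_score(x):
    """Smallest/largest covariance eigenvalue; 1.0 means perfectly isotropic."""
    eig = np.linalg.eigvalsh(np.cov(x, rowvar=False))
    return eig.min() / eig.max()

print(f"isotropic: {isotropy_score(isotropic):.3f}")  # well above zero
print(f"clustered: {isotropy_score(clustered):.3f}")  # close to zero
```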
Democratizing Neural Machine Translation with OPUS-MT
Tiedemann, Jörg, Aulamo, Mikko, Bakshandaeva, Daria, Boggia, Michele, Grönroos, Stig-Arne, Nieminen, Tommi, Raganato, Alessandro, Scherrer, Yves, Vazquez, Raul, Virpioja, Sami
Language technology carries a growing responsibility in a society that is increasingly dominated by digital communication channels. Machine translation (MT) plays a decisive role in cross-lingual information access and will continue to grow as a crucial component of our natural language processing (NLP) toolbox, enabling inclusiveness and equity among people with different cultural and linguistic backgrounds. All the major IT companies recognize the importance of MT and invest significant effort in the development of internal translation solutions, with slogans like "no language left behind".
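Since OPUS-MT models are openly distributed, for instance on the Hugging Face hub under the Helsinki-NLP organization, trying one out takes only a few lines. A brief usage sketch; the language pair and example sentence are arbitrary choices, not from the paper:

```python
from transformers import MarianMTModel, MarianTokenizer

# Any Helsinki-NLP/opus-mt-{src}-{tgt} checkpoint works the same way.
model_name = "Helsinki-NLP/opus-mt-en-fi"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["Machine translation enables cross-lingual information access."],
                  return_tensors="pt", padding=True)
translated = model.generate(**batch)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
```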
Silo NLP's Participation at WAT2022
Parida, Shantipriya, Panda, Subhadarshi, Grönroos, Stig-Arne, Granroth-Wilding, Mark, Koistinen, Mika
This paper describes "Silo NLP's" submission to the Workshop on Asian Translation (WAT2022). We participated in the Indic multimodal tasks (English->Hindi, English->Malayalam, and English->Bengali multimodal translation). For text-only translation, we trained Transformers from scratch and fine-tuned mBART-50 models. For multimodal translation, we used the same mBART architecture, extracting object tags from the images and concatenating them with the text sequence as visual features. Our submission ranked first in many tasks, including English->Hindi multimodal translation (evaluation test), English->Malayalam text-only and multimodal translation (evaluation test), English->Bengali multimodal translation (challenge test), and English->Bengali text-only translation (evaluation test).
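The multimodal input construction described above can be sketched as follows; this is a minimal illustration, and the delimiter, tag order, and checkpoint are assumptions rather than the authors' exact setup. Object tags produced by a detector are concatenated with the source sentence so that a text-only mBART model can consume them.

```python
from transformers import MBart50TokenizerFast

# mBART-50 tokenizer with language codes for an English->Hindi direction.
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt",
    src_lang="en_XX", tgt_lang="hi_IN")

object_tags = ["dog", "frisbee", "grass"]  # e.g., output of an object detector
source = "A dog is catching a frisbee on the lawn."
multimodal_input = source + " " + " ".join(object_tags)  # assumed delimiter: space

batch = tokenizer(multimodal_input, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(batch["input_ids"][0]))
```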