Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation

Mirza Yusuf, Praatibh Surana, Gauri Gupta, Krithika Ramesh

Although our computational techniques and hardware resources have advanced greatly over the past few decades, the rise of large language models with applications across multiple sectors means that the environmental impact of training and developing NLP models, particularly at scale, could have detrimental consequences for the environment. This is because their energy usage (whether carbon neutral or not) [1, 2] may contribute directly or indirectly to the effects of climate change. Through experiments on the total time required to train models such as the Transformer, BERT, and GPT-2, and the resulting cost of training, Strubell et al. [2] provide substantial evidence that researchers need to increasingly prioritize computationally efficient hardware and algorithms. Research suggests that large language models can be outperformed on multiple tasks by less computationally intensive counterparts with the help of fine-tuning [3] and techniques such as random search for hyperparameter optimization [1, 4-6] or pruning [7, 8]. Additionally, as performance across tasks tends to vary with the languages used, data availability, and model architecture, among other factors, it is likely that training models to a given performance level is less carbon-intensive for some languages than for others. This speculation is substantiated by the correlation found between the morphological ambiguity of languages and the performance of language models on European languages [9]. The primary objective of our work is to measure the differences in carbon emissions released when training on multiple language pairs, and to assess how various components within the two architectures we use contribute to those emissions.
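This section does not name the tooling used to measure emissions. As a minimal sketch, per-language-pair measurement could be done with the open-source codecarbon library, which estimates CO2-equivalent output from hardware power draw and the local grid's carbon intensity; the `train_one_epoch` function below is a hypothetical placeholder for the actual translation-model training loop.

```python
# Sketch: measuring training emissions for one language pair with codecarbon.
# `train_one_epoch` is a hypothetical stand-in; the paper's exact tooling is
# not stated in this section.
from codecarbon import EmissionsTracker

def train_one_epoch():
    """Hypothetical placeholder for one epoch of MT training."""
    pass

tracker = EmissionsTracker(project_name="mt-en-de")  # one tracker per language pair
tracker.start()
try:
    for _ in range(3):  # e.g. three epochs
        train_one_epoch()
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated training emissions: {emissions_kg:.4f} kg CO2eq")
```

Running one tracker per language pair yields directly comparable per-pair emission figures.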
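To make the random-search technique mentioned above concrete, the following sketch tries a fixed budget of randomly sampled configurations rather than exhaustively evaluating a grid. The `evaluate` function is a hypothetical stand-in for training and validating a model under one configuration; the cited works [1, 4-6] may differ in details.

```python
# Sketch of random hyperparameter search: sample a fixed number of
# configurations instead of sweeping a full grid.
import random

def evaluate(lr, dropout):
    """Hypothetical: train briefly and return a validation score."""
    return random.random()  # placeholder for a real validation metric

best_score, best_cfg = float("-inf"), None
for _ in range(20):  # 20 random trials instead of a full grid
    cfg = {"lr": 10 ** random.uniform(-5, -2), "dropout": random.uniform(0.0, 0.5)}
    score = evaluate(**cfg)
    if score > best_score:
        best_score, best_cfg = score, cfg

print(best_cfg, best_score)
```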
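Likewise, pruning reduces compute by zeroing low-importance weights. The snippet below is illustrative only, using PyTorch's built-in magnitude-pruning utilities; the schemes in the cited pruning works [7, 8] may differ.

```python
# Illustrative magnitude pruning with PyTorch's built-in utilities.
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(512, 512)

# Zero the 30% of weights with the smallest L1 magnitude.
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Make the pruning permanent (folds the mask into the weight tensor).
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"Layer sparsity after pruning: {sparsity:.0%}")
```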
We are grateful to the Research Society MIT, Manipal for supporting this work, and we attribute equal contribution to all the authors of this paper.