r/MachineLearning - [D] [Machine Translation] Sources on using monolingual data to improve translation when sufficient parallel data is already available


Does anyone know of scientific literature showing that monolingual data can be beneficial even when we already have enough parallel data (e.g., English-French)? It seems reasonable to me that if we, for instance, added monolingual data on the decoder side, the model would be better at scoring candidate predictions for fluency. That said, I cannot find peer-reviewed articles that show this.