 Machine Translation



Multimodal and Multilingual Embeddings for Large-Scale Speech Mining

Neural Information Processing Systems

Using a similarity metric in that multimodal embedding space, we perform mining of audio in German, French, Spanish and English from Librivox against billions of sentences from CommonCrawl.







Appendix for Data Diversification: A Simple Strategy For Neural Machine Translation Xuan-Phi Nguyen

Neural Information Processing Systems

Finally, we describe the training setup for our back-translation experiments. We continue to differentiate our method from other existing works. Our method does not train multiple peer models with EM training either. In each round, a forward (or backward) model takes its turn to play the "back-translation" role to train the other model. The roles are switched in the next round. In other words, source and target are identical.


Data Diversification: A Simple Strategy For Neural Machine Translation

Neural Information Processing Systems

Our method is applicable to all NMT models. It does not require extra monolingual data like back-translation, nor does it add computation or parameters like ensembles of models.