Amazon researchers reduce data required for AI transfer learning

Cross-lingual learning is an AI technique in which a natural language processing model is trained in one language and then retrained in another. Retrained models have been shown to outperform models trained from scratch in the second language, which is likely why researchers at Amazon's Alexa division are investing considerable time in the approach. In a paper scheduled to be presented at this year's Conference on Empirical Methods in Natural Language Processing, two scientists at the Alexa AI natural understanding group, Quynh Do and Judith Gaspers, along with colleagues, propose a data selection technique that halves the amount of required training data. Surprisingly, they claim, the technique improves rather than compromises the model's overall performance in the target language. "Sometimes the data in the source language is so abundant that using all of it to train a transfer model would be impractically time consuming," wrote Do and Gaspers in a blog post.