Machine Translation
Cross-Language Aphasia Detection using Optimal Transport Domain Adaptation
Balagopalan, Aparna, Novikova, Jekaterina, McDermott, Matthew B. A., Nestor, Bret, Naumann, Tristan, Ghassemi, Marzyeh
Multi-language speech datasets are scarce and often have small sample sizes in the medical domain. Robust transfer of linguistic features across languages could improve rates of early diagnosis and therapy for speakers of low-resource languages when detecting health conditions from speech. We utilize out-of-domain, unpaired, single-speaker, healthy speech data for training multiple Optimal Transport (OT) domain adaptation systems. We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1). Further, we show that adding aphasic data to the domain adaptation system significantly increases performance for both French and Mandarin, increasing the F1 scores further (10% and 8% increase in F1 scores for French and Mandarin, respectively, over unilingual baselines).
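As a rough illustration of the kind of mapping OT domain adaptation learns, the following minimal sketch computes an entropy-regularised transport plan with Sinkhorn iterations and then moves source features onto the target domain by barycentric mapping. The abstract does not say which OT solver or feature set the authors used, so the solver choice, the synthetic Gaussian features, and the names `sinkhorn_plan` and `transport` are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(source, target, reg=0.1, n_iters=500):
    """Entropy-regularised OT plan between two point clouds with uniform weights."""
    # Squared Euclidean cost between every source/target pair.
    cost = ((source[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
    cost = cost / cost.max()  # rescale so exp() stays well-conditioned
    kernel = np.exp(-cost / reg)
    a = np.full(len(source), 1.0 / len(source))  # uniform source weights
    b = np.full(len(target), 1.0 / len(target))  # uniform target weights
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)  # rescale to match target marginals
        u = a / (kernel @ v)    # rescale to match source marginals
    return u[:, None] * kernel * v[None, :]

def transport(target, plan):
    """Barycentric mapping: send each source point to the plan-weighted mean of targets."""
    return (plan @ target) / plan.sum(axis=1, keepdims=True)

# Synthetic stand-ins (assumption): English-like target features and a
# shifted cloud playing the role of features from another language.
rng = np.random.default_rng(0)
target_feats = rng.normal(0.0, 1.0, size=(60, 4))
source_feats = rng.normal(3.0, 1.0, size=(50, 4))

plan = sinkhorn_plan(source_feats, target_feats)
mapped = transport(target_feats, plan)
```

After mapping, a classifier trained on target-domain (here, English-like) features could in principle be applied to `mapped`; whether that matches the authors' pipeline is not stated in the abstract.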
The Shallowness of Google Translate
One Sunday, at one of our weekly salsa sessions, my friend Frank brought along a Danish guest. I knew Frank spoke Danish well, since his mother was Danish, and he, as a child, had lived in Denmark. As for his friend, her English was fluent, as is standard for Scandinavians. However, to my surprise, during the evening's chitchat it emerged that the two friends habitually exchanged emails using Google Translate. Frank would write a message in English, then run it through Google Translate to produce a new text in Danish; conversely, she would write a message in Danish, then let Google Translate anglicize it.
Move over, Google Translate: Here come A.I. earbuds
Forget phrase books or even Google Translate. New translation devices are getting closer to replicating the fantasy of the Babel fish, which in the "Hitchhiker's Guide to the Galaxy" sits in one's ear and instantly translates any foreign language into the user's own. The WT2 Plus Ear to Ear AI Translator Earbuds from Timekettle are already available, while the over-the-ear "Ambassador" from Waverly Labs is scheduled for release this year. Both products are wireless and come with two earpieces that must be synced to a single smartphone connected to Wi-Fi or cellular data. These devices "bring us a bit closer to being able to travel to places in the world where people speak different languages and communicate smoothly with those who are living there," said Graham Neubig, an assistant professor at the Language Technologies Institute of Carnegie Mellon University and an expert in machine learning and natural language processing.
AWS adds 22 new languages to Amazon Translate ZDNet
Amazon Translate, Amazon Web Services' real-time translation service, is getting an update with support for 22 new languages. The announcement comes a week ahead of the AWS re:Invent conference, where AWS will promote Translate and a slew of other AI-powered tools for its cloud customers. AWS on Monday also announced new services related to image recognition, voice-based UIs and IoT. Amazon Translate now supports a total of 54 languages and dialects, covering 2,804 language pairs. The neural machine translation service enables customers to easily translate information from one language to many.
Samsung Research Centers Around the World Take First Place in Prestigious AI Challenges
Samsung Electronics' Global Research & Development (R&D) Centers play a key part in developing artificial intelligence (AI) capabilities for real-world usage. A credit to the work this advanced R&D branch of Samsung undertakes, both Samsung R&D Institute Poland and Samsung Research America AI Center have recently won two prestigious global challenges. This year, Samsung R&D Institute Poland won first place in two categories, the first being text-to-text translation from English to Czech and the second – an end-to-end system translating English speech into German text. For the text-to-text translation category, researchers worked to develop a model to translate the transcript of a spoken English-language TED Talk into Czech. Developing their winning model required the Samsung team to develop large, filtered corpora from which to work and generate as much synthetic data as possible.
DiscoTK: Using Discourse Structure for Machine Translation Evaluation
Joty, Shafiq, Guzman, Francisco, Marquez, Lluis, Nakov, Preslav
We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference. We experiment with five transformations and augmentations of a base discourse tree representation based on the rhetorical structure theory, and we combine the kernel scores for each of them into a single score. Finally, we add other metrics from the ASIYA MT evaluation toolkit, and we tune the weights of the combination on actual human judgments. Experiments on the WMT12 and WMT13 metrics shared task datasets show correlation with human judgments that outperforms what the best systems that participated in these years achieved, both at the segment and at the system level.
22 New Languages And Variants, 6 New Regions For Amazon Translate Amazon Web Services
Just a few weeks ago, I told you about 7 new languages supported by Amazon Translate, our fully managed service for machine translation. Well, here I am again, announcing no less than 22 new languages and variants, as well as 6 additional AWS Regions where Translate is now available. That's what I call an update! In addition to existing languages, Translate now supports: Afrikaans, Albanian, Amharic, Azerbaijani, Bengali, Bosnian, Bulgarian, Croatian, Dari, Estonian, Canadian French, Georgian, Hausa, Latvian, Pashto, Serbian, Slovak, Slovenian, Somali, Swahili, Tagalog, and Tamil. Congratulations if you can name all countries and regions of origin: I couldn't!
Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction
Guillou, Liane, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Webber, Bonnie, Popescu-Belis, Andrei
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English-French and English-German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English-French subtask, five for French-English, nine for English-German, and six for German-English. Most of the submissions outperformed two strong language-model-based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.
City2City: Translating Place Representations across Cities
Yabe, Takahiro, Tsubouchi, Kota, Shimizu, Toru, Sekimoto, Yoshihide, Ukkusuri, Satish V.
Large mobility datasets collected from various sources have allowed us to observe, analyze, predict and solve a wide range of important urban challenges. In particular, studies have generated place representations (or embeddings) from mobility patterns in a similar manner to word embeddings to better understand the functionality of different places within a city. However, studies have been limited to generating such representations for each city individually and have lacked an inter-city perspective, which has made it difficult to transfer the insights gained from the place representations across different cities. In this study, we attempt to bridge this research gap by treating cities and languages analogously. We apply methods developed for unsupervised machine translation tasks to translate place representations across different cities. Real-world mobility data collected from mobile phone users in 2 cities in Japan are used to test our place representation translation methods. Translated place representations are validated using land-use data, and results show that our methods were able to accurately translate place representations from one city to another.
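To make the idea of translating embeddings between two spaces concrete, here is a minimal sketch of orthogonal Procrustes alignment, a step that several embedding-translation pipelines use to refine a linear map between spaces. It is not the authors' method: the abstract describes unsupervised translation, whereas this sketch assumes known, row-aligned anchor pairs, and the synthetic data and the name `procrustes_rotation` are hypothetical.

```python
import numpy as np

def procrustes_rotation(src, tgt):
    """Orthogonal W minimising ||src @ W - tgt||_F for row-aligned anchor pairs."""
    # The minimiser is U @ Vt, where U, Vt come from the SVD of src.T @ tgt.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Synthetic stand-ins (assumption): city B's place embeddings are an
# unknown rotation of city A's, so a perfect orthogonal map exists.
rng = np.random.default_rng(0)
city_a = rng.normal(size=(100, 8))
true_rot, _ = np.linalg.qr(rng.normal(size=(8, 8)))
city_b = city_a @ true_rot

w = procrustes_rotation(city_a, city_b)
translated = city_a @ w  # city A's embeddings expressed in city B's space
```

In a fully unsupervised setting the anchor pairs would themselves have to be induced, for example from an initial rough alignment, before a step like this could be applied.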
Amazon Translate gains 22 languages and 6 server regions
Early December marks the kickoff of Amazon's AWS re:Invent conference in Las Vegas, and ahead of the festivities the tech giant has unveiled a slew of product enhancements. To this end, Amazon Translate, the company's cloud machine translation service that delivers language translation via API requests, today gained new languages and variants and expanded to new regions globally. By way of a refresher, Translate -- which debuted in preview in November 2017 ahead of general availability last April -- taps AI that aims to deliver more accurate and natural-sounding translation than statistical or rule-based approaches. It allows customers to define how brand names, character names, model names, and other unique terms get translated. When used in tandem with a natural language processing app, Translate also facilitates sentiment analysis.