Machine Translation
Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition
Panchbhai, Anand, Soru, Tommaso, Marx, Edgard
A booming amount of information is continuously added to the Internet as structured and unstructured data, feeding knowledge bases such as DBpedia and Wikidata with billions of statements describing millions of entities. The aim of Question Answering systems is to allow lay users to access such data using natural language without needing to write formal queries. However, users often submit questions that are complex and require a certain level of abstraction and reasoning to decompose them into basic graph patterns. In this short paper, we explore the use of architectures based on Neural Machine Translation called Neural SPARQL Machines to learn pattern compositions. We show that sequence-to-sequence models are a viable and promising option to transform long utterances into complex SPARQL queries.
Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Zheng, Renjie, Ma, Mingbo, Zheng, Baigong, Liu, Kaibo, Yuan, Jiahong, Church, Kenneth, Huang, Liang
Simultaneous speech-to-speech translation is widely useful but extremely challenging, since it needs to generate target-language speech concurrently with the source-language speech, with only a few seconds' delay. In addition, it needs to continuously translate a stream of sentences, but all recent solutions merely focus on the single-sentence scenario. As a result, current approaches accumulate latencies progressively when the speaker talks faster, and introduce unnatural pauses when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation (SAT) which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech (as measured by the naturalness metric MOS) with substantially lower latency than the baseline, in both Zh <-> En directions.
The first AI model that translates 100 languages without relying on English data
Facebook AI is introducing M2M-100, the first multilingual machine translation (MMT) model that translates between any pair of 100 languages without relying on English data. When translating, say, Chinese to French, previous best multilingual models train on Chinese to English and English to French, because English training data is the most widely available. Our model directly trains on Chinese to French data to better preserve meaning. It outperforms English-centric systems by 10 points on the widely used BLEU metric for evaluating machine translations. M2M-100 is trained on a total of 2,200 language directions -- or 10x more than previous best, English-centric multilingual models.
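To put the 2,200 figure in context, the arithmetic behind "language directions" can be sketched as follows. This is an illustration only: the helper names are hypothetical, and it assumes English is one of the 100 languages and that an English-centric system covers every language to and from English.

```python
def ordered_pairs(n_languages):
    """Number of translation directions (source -> target) among n languages."""
    return n_languages * (n_languages - 1)


def english_centric_directions(n_languages):
    """Directions with English as source or target (assumes English is one of the n)."""
    return 2 * (n_languages - 1)


n = 100
all_directions = ordered_pairs(n)             # every possible source -> target pair
via_english = english_centric_directions(n)   # X -> En and En -> X only
trained = 2200                                # figure quoted in the text above

print(all_directions, via_english, round(trained / all_directions, 3))
```

Under these assumptions, the directions used for training are a subset of all possible ordered pairs, while still being far more numerous than the English-only directions.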
Facebook's new AI can translate languages directly into one another
Whether you're logging on from the US, Brazil, Borneo, or France, Facebook can translate virtually any written content published on its platform into the local language using automated machine translation. In fact, Facebook provides around 20 billion translations every day for its News Feed alone. However, these systems typically use English as an intermediary step -- that is, translating from Chinese to French actually goes Chinese to English to French. This is done because data sets of translations to and from English are massive and widely available, but putting English in the middle reduces the overall translation accuracy while making the entire process more complex and cumbersome than it needs to be. That's why Facebook AI has developed a new MT model that can bidirectionally translate directly between two languages (Chinese to French and French to Chinese) without ever using English as a crutch -- and which outperforms the English-centric model by 10 points on BLEU metrics.
Bayesian Attention Modules
Fan, Xinjie, Zhang, Shujian, Chen, Bo, Zhou, Mingyuan
Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability. Most current models use deterministic attention modules due to their simplicity and ease of optimization. Stochastic counterparts, on the other hand, are less popular despite their potential benefits. The main reason is that stochastic attention often introduces optimization issues or requires significant model changes. In this paper, we propose a scalable stochastic version of attention that is easy to implement and optimize. We construct simplex-constrained attention distributions by normalizing reparameterizable distributions, making the training process differentiable. We learn their parameters in a Bayesian framework where a data-dependent prior is introduced for regularization. We apply the proposed stochastic attention modules to various attention-based models, with applications to graph node classification, visual question answering, image captioning, machine translation, and language understanding. Our experiments show the proposed method brings consistent improvements over the corresponding baselines.
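The idea of building simplex-constrained attention by normalizing reparameterizable samples can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the lognormal noise model, the `sigma` parameter, and the function names are assumptions made here, and the Bayesian part (the data-dependent prior and its regularization term) is left out entirely.

```python
import math
import random


def softmax(scores):
    """Deterministic attention: exponentiate the scores and normalize."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_attention(scores, sigma=0.5, rng=random):
    """Sample attention weights that lie on the probability simplex.

    Each unnormalized weight is exp(score + sigma * eps) with eps ~ N(0, 1),
    a lognormal sample in reparameterized form: the randomness is isolated
    in eps, so the sample is a differentiable function of the scores.
    Dividing by the sum makes the weights positive and sum to one.
    """
    m = max(scores)  # constant shift; it cancels in the normalization
    samples = [math.exp(s - m + sigma * rng.gauss(0.0, 1.0)) for s in scores]
    total = sum(samples)
    return [w / total for w in samples]


rng = random.Random(0)
scores = [2.0, 1.0, 0.1]
det = softmax(scores)
sto = stochastic_attention(scores, sigma=0.5, rng=rng)
print(det)
print(sto)
```

With `sigma=0.0` the sampled weights reduce to the ordinary softmax, which makes the deterministic module a special case of this sketch.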
Word Shape Matters: Robust Machine Translation with Visual Embedding
Wang, Haohan, Zhang, Peiyan, Xing, Eric P.
Neural machine translation has achieved remarkable empirical performance over standard benchmark datasets, yet recent evidence suggests that the models can still fail easily when dealing with substandard inputs such as misspelled words. To overcome this issue, we introduce a new encoding heuristic of the input symbols for character-level NLP models: it encodes the shape of each character through the images depicting the letters when printed. We name this new strategy visual embedding and it is expected to improve the robustness of NLP models because humans also process the corpus visually through printed letters, instead of machinery one-hot vectors. Empirically, our method improves models' robustness against substandard inputs, even in the test scenario where the models are tested with noise beyond what is available during the training phase.
Facebook AI can translate directly between any of 100 languages
Facebook has developed an artificial intelligence capable of accurately translating between any pair of 100 languages without relying on first translating to English, as many existing systems do. The AI outperforms such systems by 10 points on a 100-point scale used by academics to automatically evaluate the quality of machine translations. Translations produced by the model were also assessed by humans, who scored it as around 90 per cent accurate. Facebook's system was trained on a data set of 7.5 billion sentence pairs gathered from the web across 100 languages, though not all the languages had an equal number of sentence pairs. "What I really was interested in was cutting out English as a middle man. Globally there are plenty of regions where they speak two languages that aren't English," says Angela Fan of Facebook AI, who led the work.
Diving Deep into Context-Aware Neural Machine Translation
Huo, Jingjing, Herold, Christian, Gao, Yingbo, Dahlmann, Leonard, Khadivi, Shahram, Ney, Hermann
Context-aware neural machine translation (NMT) is a promising direction to improve the translation quality by making use of the additional context, e.g., document-level translation, or having meta-information. Although there exist various architectures and analyses, the effectiveness of different context-aware NMT models is not well explored yet. This paper analyzes the performance of document-level NMT models on four diverse domains with a varied amount of parallel document-level bilingual data. We conduct a comprehensive set of experiments to investigate the impact of document-level NMT. We find that there is no single best approach to document-level NMT, but rather that different architectures come out on top on different tasks. Looking at task-specific problems, such as pronoun resolution or headline translation, we find improvements in the context-aware systems, even in cases where the corpus-level metrics like BLEU show no significant improvement. We also show that document-level back-translation significantly helps to compensate for the lack of document-level bi-texts.
A lifetime of WhiteSmoke's grammar tool is now $40
Your first impression is your strongest, and with the amount of virtual communication we use these days, you best be sure that your writing is top-notch. People will judge your character based on how well you can articulate your thoughts on paper, so your writing can have a major impact on how you interact with your colleagues, professors, clients, etc. Yes, that includes emails and Slack messages as well. No one becomes an amazing writer overnight, though. Even then, the best writers will make grammatical errors here and there. It takes years of practice to become a great writer, but that doesn't mean you can't ask for help along the way.
Meta-Learning for Low-Resource Unsupervised Neural Machine Translation
Tae, Yunwon, Park, Cheonbok, Kim, Taehee, Yang, Soyoung, Khan, Mohammad Azam, Park, Eunjeong, Qin, Tao, Choo, Jaegul
Unsupervised machine translation, which utilizes unpaired monolingual corpora as training data, has achieved performance comparable to supervised machine translation. However, it still suffers from data-scarce domains. To address this issue, this paper presents a meta-learning algorithm for unsupervised neural machine translation (UNMT) that trains the model to adapt to another domain by utilizing only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Hence, we extend the meta-learning algorithm, which utilizes knowledge learned from high-resource domains to boost the performance of low-resource UNMT. Our model surpasses a transfer learning-based approach by up to 2-4 BLEU points. Extensive experimental results show that our proposed algorithm is pertinent for fast adaptation and consistently outperforms other baseline models.