Machine Translation
Adapting machine translation models to new genres
Neural machine translation systems are often optimized to perform well for specific text genres or domains, such as newspaper articles, user manuals, or customer support chats. In industrial settings with hundreds of language pairs to serve, however, a single translation system per language pair, which performs well across different text domains, is more efficient to deploy and maintain. Additionally, service providers may not know in advance which domains customers will be interested in. At this year's Conference on Empirical Methods in Natural Language Processing (EMNLP), we are presenting a new approach to multidomain adaptation for neural translation models, or adapting an existing model to new domains while maintaining translation quality in the original domain. Our approach provides a better trade-off between performance on old and new tasks than its predecessors do.
Transformer Based Bengali Chatbot Using General Knowledge Dataset
Masum, Abu Kaisar Mohammad, Abujar, Sheikh, Akter, Sharmin, Ria, Nushrat Jahan, Hossain, Syed Akhter
An AI chatbot can give impressive responses after learning from a training dataset. Most research work in this decade demonstrates that deep neural models are superior to other models. RNN models are regularly used for sequence-related problems such as question answering, an approach widely known as seq2seq learning. A seq2seq model has an encoder and a decoder: the encoder embeds the input sequence, and the decoder embeds the output sequence. To strengthen seq2seq performance, an attention mechanism was added to the encoder and decoder. The transformer model was subsequently introduced as a high-performance model with multiple attention mechanisms for sequence-related problems. It reduces training time compared with RNN-based models and achieves state-of-the-art performance for sequence transduction. In this research, we applied the transformer model to a Bengali general knowledge chatbot based on the Bengali general knowledge Question Answer (QA) dataset. It scores 85.0 BLEU on the applied QA data. For comparison, we trained a seq2seq model with attention on our dataset, which scores 23.5 BLEU.
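The attention mechanism mentioned in this abstract can be illustrated with a minimal sketch of scaled dot-product attention, the standard building block of transformer models. This is not the authors' implementation; the function names `softmax` and `attention` and the plain-list matrix representation are assumptions chosen to keep the example dependency-free.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    scale = math.sqrt(len(keys[0]))  # d = key dimension
    width = len(values[0])
    output: Matrix = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return output


# A zero query scores every key equally, so the result is the mean of the values.
print(attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0], [4.0]]))
```

A multi-head transformer layer applies several such attention functions in parallel on learned projections of the inputs; those projections are omitted here.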
Flight Demand Forecasting with Transformers
Wang, Liya, Mykityshyn, Amy, Johnson, Craig, Cheng, Jillian
Transformers have become the de-facto standard in the natural language processing (NLP) field. They have also gained momentum in computer vision and other domains. Transformers can enable artificial intelligence (AI) models to dynamically focus on certain parts of their input and thus reason more effectively. Inspired by the success of transformers, we adopted this technique to predict strategic flight departure demand in multiple horizons. This work was conducted in support of a MITRE-developed mobile application, Pacer, which displays predicted departure demand to general aviation (GA) flight operators so they can have better situation awareness of the potential for departure delays during busy periods. Field demonstrations involving Pacer's previously designed rule-based prediction method showed that the prediction accuracy of departure demand still has room for improvement. This research strives to improve prediction accuracy from two key aspects: better data sources and robust forecasting algorithms. We leveraged two data sources, Aviation System Performance Metrics (ASPM) and System Wide Information Management (SWIM), as our input. We then trained forecasting models with temporal fusion transformer (TFT) for five different airports. Case studies show that TFTs can perform better than traditional forecasting methods by large margins, and they can result in better prediction across diverse airports and with better interpretability.
Lingua Custodia's participation at the WMT 2021 Machine Translation using Terminologies shared task
Ailem, Melissa, Liu, Jinghsu, Qader, Raheel
This paper describes Lingua Custodia's submission to the WMT21 shared task on machine translation using terminologies. We consider three directions, namely English to French, Russian, and Chinese. We rely on a Transformer-based architecture as a building block, and we explore a method which introduces two main changes to the standard procedure to handle terminologies. The first one consists in augmenting the training data in such a way as to encourage the model to learn a copy behavior when it encounters terminology constraint terms. The second change is constraint token masking, whose purpose is to ease copy behavior learning and to improve model generalization. Empirical results show that our method satisfies most terminology constraints while maintaining high translation quality.
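The two changes described above (training data augmentation that encourages copying, and constraint token masking) can be sketched roughly as follows. The tag tokens `<t>`, `<sep>`, `</t>`, the `<mask>` token, and the function `annotate_source` are hypothetical, since the abstract does not specify them, and the sketch handles only single-token terms.

```python
from typing import Dict, List

# Hypothetical tag and mask tokens; the abstract does not name the ones used.
TERM_START, TERM_SEP, TERM_END, MASK = "<t>", "<sep>", "</t>", "<mask>"


def annotate_source(
    tokens: List[str],
    constraints: Dict[str, str],
    mask_source: bool = True,
) -> List[str]:
    """Place the required target term next to each constrained source token.

    With mask_source=True the source term itself is replaced by a mask token,
    so the model has to rely on the injected target term.
    """
    out: List[str] = []
    for tok in tokens:
        target = constraints.get(tok)
        if target is None:
            out.append(tok)  # unconstrained token, copied through unchanged
            continue
        source_side = MASK if mask_source else tok
        out.extend([TERM_START, source_side, TERM_SEP, target, TERM_END])
    return out


example = annotate_source("the patient has a fever".split(), {"fever": "fièvre"})
print(" ".join(example))
# the patient has a <t> <mask> <sep> fièvre </t>
```

Training on source sentences annotated this way, paired with references that contain the target term, is one way a model could learn to copy the term into its output.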
UQuAD1.0: Development of an Urdu Question Answering Training Data for Machine Reading Comprehension
In recent years, low-resource Machine Reading Comprehension (MRC) has made significant progress, with models achieving remarkable performance on datasets in various languages. However, none of these models have been customized for the Urdu language. This work explores the semi-automated creation of the Urdu Question Answering Dataset (UQuAD1.0) by combining machine-translated SQuAD with human-generated samples derived from Wikipedia articles and Urdu RC worksheets from Cambridge O-level books. UQuAD1.0 is a large-scale Urdu dataset intended for extractive machine reading comprehension tasks, consisting of 49k question-answer pairs in question, passage, and answer format. In UQuAD1.0, 45,000 QA pairs were generated by machine translation of the original SQuAD1.0 and approximately 4,000 pairs via crowdsourcing. In this study, we used two types of MRC models: a rule-based baseline and advanced Transformer-based models. We found that the latter outperform the former and therefore concentrate solely on Transformer-based architectures. Using XLMRoBERTa and multi-lingual BERT, we obtain F1 scores of 0.66 and 0.63, respectively.
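The abstract does not say which F1 variant is reported. Assuming it is the token-overlap F1 commonly used for SQuAD-style extractive QA, a minimal sketch looks like this. The function name `token_f1` and the whitespace tokenization are assumptions; real evaluation scripts usually normalize the text first.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Two of three predicted tokens are correct and both reference tokens are
# found: precision 2/3, recall 1, F1 0.8 (up to floating-point rounding).
print(token_f1("a b c", "a b"))
```

A dataset-level score under this assumption would be the average of `token_f1` over all questions, taking the best-matching reference when several are available.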
How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus
Park, Chanjun, Lee, Seolhwa, Moon, Hyeonseok, Eo, Sugyeong, Seo, Jaehyung, Lim, Heuiseok
This paper proposes a tool for efficiently constructing high-quality parallel corpora while minimizing human labor, and makes the tool publicly available. Our proposed construction process is based on neural machine translation (NMT), so that NMT not only coexists with human translation but also improves its efficiency by combining data quality control with human translation in a data-centric approach.
Understanding How Encoder-Decoder Architectures Attend
Aitken, Kyle, Ramasesh, Vinay V, Cao, Yuan, Maheswaranathan, Niru
Encoder-decoder networks with attention have proven to be a powerful way to solve many sequence-to-sequence tasks. In these networks, attention aligns encoder and decoder states and is often used for visualizing network behavior. However, the mechanisms used by networks to generate appropriate attention matrices are still mysterious. Moreover, how these mechanisms vary depending on the particular architecture used for the encoder and decoder (recurrent, feed-forward, etc.) is also not well understood. In this work, we investigate how encoder-decoder networks solve different sequence-to-sequence tasks. We introduce a way of decomposing hidden states over a sequence into temporal (independent of input) and input-driven (independent of sequence position) components. This reveals how attention matrices are formed: depending on the task requirements, networks rely more heavily on either the temporal or input-driven components. These findings hold across both recurrent and feed-forward architectures despite their differences in forming the temporal components. Overall, our results provide new insight into the inner workings of attention-based encoder-decoder networks.
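One plausible way to compute such a decomposition is sketched below. It assumes the temporal component is estimated by averaging hidden states over sequences at each position, and the input-driven component by averaging what remains over all occurrences of each token. The abstract does not spell out the estimation procedure, so this averaging scheme, the function names, and the equal-length-sequence requirement are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def decompose(
    hidden: List[List[Vector]],
    tokens: List[List[str]],
) -> Tuple[List[Vector], Dict[str, Vector]]:
    """Estimate temporal and input-driven components of hidden states.

    hidden[s][t] is the state of sequence s at position t, and tokens[s][t]
    is the input token there. All sequences must have the same length.
    """
    steps = len(hidden[0])
    # Temporal component: depends on the position only, not on the input.
    temporal = [mean([seq[t] for seq in hidden]) for t in range(steps)]
    # Input-driven component: depends on the token only, not on the position.
    by_token: Dict[str, List[Vector]] = defaultdict(list)
    for seq, toks in zip(hidden, tokens):
        for t, (state, tok) in enumerate(zip(seq, toks)):
            by_token[tok].append([h - m for h, m in zip(state, temporal[t])])
    input_driven = {tok: mean(vectors) for tok, vectors in by_token.items()}
    return temporal, input_driven
```

Whatever is left after subtracting both components from a hidden state is a residual that depends on position and input jointly.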
Visually Grounded Reasoning across Languages and Cultures
Liu, Fangyu, Bugliarello, Emanuele, Ponti, Edoardo Maria, Reddy, Siva, Collier, Nigel, Elliott, Desmond
The design of widespread vision-and-language datasets and pre-trained encoders directly adopts, or draws inspiration from, the concepts and images of ImageNet. While one can hardly overestimate how much this benchmark contributed to progress in computer vision, it is mostly derived from lexical databases and image queries in English, resulting in source material with a North American or Western European bias. Therefore, we devise a new protocol to construct an ImageNet-style hierarchy representative of more languages and cultures. In particular, we let the selection of both concepts and images be entirely driven by native speakers, rather than scraping them automatically. Specifically, we focus on a typologically diverse set of languages, namely, Indonesian, Mandarin Chinese, Swahili, Tamil, and Turkish. On top of the concepts and images obtained through this new protocol, we create a multilingual dataset for Multicultural Reasoning over Vision and Language (MaRVL) by eliciting statements from native speaker annotators about pairs of images. The task consists of discriminating whether each grounded statement is true or false. We establish a series of baselines using state-of-the-art models and find that their cross-lingual transfer performance lags dramatically behind supervised performance in English. These results invite us to reassess the robustness and accuracy of current state-of-the-art models beyond a narrow domain, but also open up new exciting challenges for the development of truly multilingual and multicultural systems.
How AI is booming in Berlin
With a wave of new startups and cutting-edge research, Berlin is taking part in this revolution. With a total of more than 300 companies, Berlin is a European melting pot for innovators and visionaries in the field of AI. AI is a buzzword, often associated with sci-fi dystopian scenarios such as robots outsmarting mankind, flying cars, or a suave-speaking computer operating system we fall in love with. But as we now know, we're surrounded by AI every day. Helpful chatbots, parking assistants, and face recognition, to name just a few examples, are making our lives more convenient.
AI: The Inverse Tower of Babbel
The Old Testament's 'Tower of Babel' story is an origin myth that tries to explain why humanity doesn't speak a single, universal language. According to the Bible, a united human race that spoke the same language arrived in the land of Shinar and decided to build a tower tall enough to reach heaven. Annoyed (once again, it can probably be said) by humanity's growing arrogance and budding hubris, God confounded humanity's speech, dividing its people into separate linguistic groups that couldn't understand one another. Just to ensure they didn't start comparing and contrasting their languages to reach some form of translating breakthrough, God dispersed humankind to all corners of the earth and set the stage for what is today a world of 6,500 languages. For God, it was a job well done, and the situation remained static for centuries. That was until tribes started trading with each other, armies started fighting one another, and diplomats initiated conflict resolution measures to try to end the wars that were often started due to misunderstandings of one kind or another.