 Machine Translation


Evaluating Sequence-to-Sequence Learning Models for If-Then Program Synthesis

arXiv.org Machine Learning

Implementing enterprise process automation often requires significant technical expertise and engineering effort. It would be beneficial for non-technical users to be able to describe a business process in natural language and have an intelligent system generate a workflow that can be executed automatically. If-Then programs are a building block of process automation. In the consumer space, sites like IFTTT and Zapier allow users to create automations by defining If-Then programs using a graphical interface. We explore the efficacy of modeling If-Then programs as a sequence learning task. We find that Seq2Seq approaches have high potential (performing strongly on the Zapier recipes) and can serve as a promising approach to more complex program synthesis challenges.


Time-aware Large Kernel Convolutions

arXiv.org Machine Learning

To date, most state-of-the-art sequence modelling architectures use attention to build generative models for language-based tasks. Some of these models use all the available sequence tokens to generate an attention distribution, which results in a time complexity of $O(n^2)$. Alternatively, they utilize depthwise convolutions with softmax-normalized kernels of size $k$ acting as a limited-window self-attention, resulting in a time complexity of $O(k{\cdot}n)$. In this paper, we introduce Time-aware Large Kernel (TaLK) Convolutions, a novel adaptive convolution operation that learns to predict the size of a summation kernel instead of using a fixed-size kernel matrix. This method yields a time complexity of $O(n)$, effectively making the sequence encoding process linear in the number of tokens. We evaluate the proposed method on large-scale standard machine translation and language modelling datasets and show that TaLK Convolutions constitute an efficient improvement over other attention/convolution-based approaches.
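The abstract does not spell out how the summation kernel reaches $O(n)$. One common way to sum over a per-position window in linear time is a prefix-sum table, sketched below. This is an illustration under assumptions, not the authors' implementation: the function name `adaptive_window_mean`, the integer offsets `left` and `right`, and the normalization by window length are all hypothetical choices made for this example.

```python
import numpy as np

def adaptive_window_mean(x, left, right):
    """Average x over a window [i - left[i], i + right[i]] for every position i.

    The window sizes differ per position (hypothetical integer offsets),
    yet the total cost is O(n) because each window sum is a difference
    of two entries in a prefix-sum table.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # prefix[j] holds x[0] + ... + x[j-1], so prefix[0] == 0.
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    idx = np.arange(n)
    # Clip the window boundaries so they stay inside the sequence.
    lo = np.clip(idx - np.asarray(left), 0, n - 1)
    hi = np.clip(idx + np.asarray(right), 0, n - 1)
    sums = prefix[hi + 1] - prefix[lo]
    return sums / (hi - lo + 1)
```

Each position costs two table lookups regardless of how wide its window is, in contrast to the $O(k{\cdot}n)$ cost of applying a size-$k$ kernel at every position.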


CCMatrix: A billion-scale bitext data set for training translation models

#artificialintelligence

CCMatrix is the largest data set of high-quality, web-based bitexts for training translation models. With more than 4.5 billion parallel sentences in 576 language pairs pulled from snapshots of the CommonCrawl public data set, CCMatrix is more than 50 times larger than the WikiMatrix corpus that we shared last year. Gathering a data set of this size required modifying our previous bitext mining approach used for WikiMatrix, assuming that the translation of one sentence could be found anywhere on CommonCrawl, which functions as an open archive of the internet. To address the significant computational challenges posed by comparing billions of sentences to determine which ones are mutual translations, we used massively parallel processing, as well as our highly efficient FAISS library for fast similarity searches. We're sharing details about how we created CCMatrix, and the tools needed for other researchers to reproduce our results and use this corpus for their work.
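As a rough illustration of deciding which sentences are mutual translations by similarity search, the sketch below pairs sentences whose embeddings are each other's nearest neighbor under cosine similarity. It is a brute-force NumPy toy, not the FAISS-based pipeline described above. The function name `mutual_nearest_pairs` and the mutual-nearest-neighbor criterion are assumptions for this example, and the embeddings are presumed to come from some multilingual sentence encoder that is not shown.

```python
import numpy as np

def mutual_nearest_pairs(src, tgt):
    """Return (source index, target index) pairs that are mutual nearest neighbors.

    src and tgt are 2-D arrays of sentence embeddings, one row per sentence.
    All pairwise similarities are computed at once, which is only feasible
    for small inputs; a billion-scale corpus needs an index such as FAISS.
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    # Normalize rows so that the dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sim = src @ tgt.T
    best_tgt = sim.argmax(axis=1)  # closest target for each source sentence
    best_src = sim.argmax(axis=0)  # closest source for each target sentence
    return [(i, int(j)) for i, j in enumerate(best_tgt) if best_src[j] == i]
```

Requiring agreement in both directions filters out sentences that merely resemble a popular target without being its best match.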


Translating Web Search Queries into Natural Language Questions

arXiv.org Artificial Intelligence

Users often query a search engine with a specific question in mind, and these queries are often keywords or sub-sentential fragments. For example, if users want to know the answer to "What's the capital of USA", they will most probably query "capital of USA" or "USA capital" or some keyword-based variation of this. Conversely, for the user-entered query "capital of USA", the most probable question intent is "What's the capital of USA?". In this paper, we propose a method to generate a well-formed natural language question from a given keyword-based query, such that the question has the same intent as the query. Converting a keyword-based web query into a well-formed question has many applications, including search engines, Community Question Answering (CQA) websites, and bot communication. We found a synergy between the query-to-question problem and the standard machine translation (MT) task. We used both Statistical MT (SMT) and Neural MT (NMT) models to generate questions from queries. We observed that MT models perform well in terms of both automatic and human evaluation.


Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

arXiv.org Machine Learning

Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition. We study the related issue of receiving infinite-length sequences from a recurrent language model when using common decoding algorithms. To analyze this issue, we first define inconsistency of a decoding algorithm, meaning that the algorithm can yield an infinite-length sequence that has zero probability under the model. We prove that commonly used incomplete decoding algorithms - greedy search, beam search, top-k sampling, and nucleus sampling - are inconsistent, despite the fact that recurrent language models are trained to produce sequences of finite length. Based on these insights, we propose two remedies which address inconsistency: consistent variants of top-k and nucleus sampling, and a self-terminating recurrent language model. Empirical results show that inconsistency occurs in practice, and that the proposed methods prevent inconsistency.
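The abstract does not say how the consistent variants are built. One natural reading, assumed here, is that the end-of-sequence token is never pruned from the candidate set, so that termination always remains possible whenever the model assigns it nonzero probability. The function name `top_k_with_eos` and its parameters are hypothetical.

```python
import numpy as np

def top_k_with_eos(probs, k, eos_id):
    """Renormalized next-token distribution over the top-k tokens plus EOS.

    Plain top-k sampling can drop the end-of-sequence token at every step;
    this variant (an assumed reading of "consistent top-k") always keeps it.
    """
    probs = np.asarray(probs, dtype=float)
    keep = np.argsort(probs)[::-1][:k]  # indices of the k most probable tokens
    mask = np.zeros(probs.shape, dtype=bool)
    mask[keep] = True
    mask[eos_id] = True  # never prune the end-of-sequence token
    filtered = np.where(mask, probs, 0.0)
    return filtered / filtered.sum()
```

Sampling from the returned distribution behaves like top-k sampling except that the end-of-sequence token keeps its relative weight at every step.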


Translate this: How real-time translation breaks down barriers when you don't speak the language

USATODAY - Tech Top Stories

In the sci-fi world crafted by Douglas Adams in "The Hitchhiker's Guide to the Galaxy," you'd just slap a bright yellow Babel fish in your ear and simply be able to understand any mix of languages around you. While we aren't quite there yet, language is becoming less of a barrier than in generations past. "Understanding is going to become the new normal," says Dave Limp, Amazon's senior vice president of devices and services. Kids "will never grow up in a world where they aren't able to hear any language." To that end, today's technology is helping to interpret and translate the world around us in ways that are nearing seamless and in real time. From apps on your phone to increasingly multilingual virtual personal assistants, communicating as a tourist or with clients, friends and family who don't speak the same language is less of a challenge. Yet for all the authentic gains achieved in translation over the past several years, don't count on your phone, smart speaker, PC or ear device ...


Smart Language Translation Solutions and Software for Enterprise - Lingmo International

#artificialintelligence

We understand that when the language barrier is removed, it is easier to communicate with consumers who speak other languages. We can help you speak to your customers in 80 languages and scale into new international markets with our smart translation solutions.


Neural Machine Translation System of Indic Languages -- An Attention based Approach

arXiv.org Machine Learning

Neural machine translation (NMT) is a recent and effective technique that has led to remarkable improvements over conventional machine translation techniques. The proposed neural machine translation model, developed for the Gujarati language, contains an encoder-decoder with an attention mechanism. In India, almost all languages originate from their ancestral language, Sanskrit. They have inevitable similarities, including lexical and named-entity similarity. Translating into Indic languages is always a challenging task. In this paper, we present a neural machine translation (NMT) system that can efficiently translate Indic languages like Hindi and Gujarati, which together cover more than 58.49 percent of total speakers in the country. We have evaluated the performance of our NMT model with automatic evaluation metrics such as BLEU, perplexity, and TER. A comparison of our network with Google Translate is also presented, in which our network outperformed it by a margin of 6 BLEU points on English-Gujarati translation.


Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker

arXiv.org Machine Learning

To access data stored in relational databases, users need to understand the database schema and write a query using a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question into the corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker to improve the performance of generative text-to-SQL models by extracting the best SQL query from the beam output predicted by the text-to-SQL generator, resulting in improved performance in the cases where the best query was in the candidate list, but not at the top of the list. We build the re-ranker as a schema-agnostic, fine-tuned BERT classifier. We analyze the relative strengths of the text-to-SQL and re-ranker models across different query hardness levels, and suggest how to combine the two models for optimal performance. We demonstrate the effectiveness of the re-ranker by applying it to two state-of-the-art text-to-SQL models, and achieve a top-4 score on the Spider leaderboard at the time of writing this article.
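The re-ranking step can be sketched as choosing the beam candidate that a separate scorer rates highest, instead of taking the generator's first candidate. The sketch below is a minimal illustration, not the paper's model: the function name `rerank_beam` and the `score` callable are hypothetical, and `score` stands in for the fine-tuned BERT classifier, which is not shown.

```python
from typing import Callable, Sequence

def rerank_beam(
    question: str,
    beam: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate SQL query that the scorer rates highest.

    `beam` is the generator's candidate list in its original order and
    `score(question, sql)` is any function that returns a higher value
    for a better match (a placeholder for a trained classifier).
    """
    if not beam:
        raise ValueError("beam must contain at least one candidate")
    # max() keeps the earliest candidate on ties, so the generator's own
    # ranking acts as the tie-breaker.
    return max(beam, key=lambda sql: score(question, sql))
```

This helps only in the situation the abstract describes, where the correct query is already somewhere in the beam but not in first place.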


Word Sense Disambiguation

#artificialintelligence

The history and development of Artificial Intelligence has seen numerous peaks and troughs. Hype around what machines can accomplish leads to boosts in AI funding, while unmet expectations cripple the industry until the next breakthrough. The term AI Winter refers to periods of reduced funding and interest in artificial intelligence development. During the Cold War, there was increased interest in Machine Translation to automate the translation of Russian documents into English. This period also coincided with massive strides in linguistics and the early career of the famed linguist Noam Chomsky.