Goto

Collaborating Authors

 Machine Translation


Neural Machine Translation System of Indic Languages -- An Attention based Approach

arXiv.org Machine Learning

Neural machine translation (NMT) is a recent and effective technique which led to remarkable improvements in comparison of conventional machine translation techniques. Proposed neural machine translation model developed for the Gujarati language contains encoder-decoder with attention mechanism. In India, almost all the languages are originated from their ancestral language - Sanskrit. They are having inevitable similarities including lexical and named entity similarity. Translating into Indic languages is always be a challenging task. In this paper, we have presented the neural machine translation system (NMT) that can efficiently translate Indic languages like Hindi and Gujarati that together covers more than 58.49 percentage of total speakers in the country. We have compared the performance of our NMT model with automatic evaluation matrices such as BLEU, perplexity and TER matrix. The comparison of our network with Google translate is also presented where it outperformed with a margin of 6 BLEU score on English-Gujarati translation.


Bertrand-DR: Improving Text-to-SQL using a Discriminative Re-ranker

arXiv.org Machine Learning

To access data stored in relational databases, users need to understand the database schema and write a query using a query language such as SQL. To simplify this task, text-to-SQL models attempt to translate a user's natural language question to corresponding SQL query. Recently, several generative text-to-SQL models have been developed. We propose a novel discriminative re-ranker to improve the performance of generative text-to-SQL models by extracting the best SQL query from the beam output predicted by the text-to-SQL generator, resulting in improved performance in the cases where the best query was in the candidate list, but not at the top of the list. We build the re-ranker as a schema agnostic BERT fine-tuned classifier. We analyze relative strengths of the text-to-SQL and re-ranker models across different query hardness levels, and suggest how to combine the two models for optimal performance. We demonstrate the effectiveness of the re-ranker by applying it to two state-of-the-art text-to-SQL models, and achieve top 4 score on the Spider leaderboard at the time of writing this article.


Word Sense Disambiguation

#artificialintelligence

The history and development of Artificial Intelligence has seen numerous peaks and troughs. Hype around what machines can accomplish lead to boosts in AI funding while unmet expectations cripple the industry until the next breakthrough. The term AI Winter refers to periods in history of reduced funding and interest in artificial intelligence development. During the cold war, there was an increased interest in Machine Translation to automate the translation of Russian documents into English. This time period also coincided with massive strides in linguistic developments and the early career of the famed linguist Noam Chomsky.


Massively Multilingual Document Alignment with Cross-lingual Sentence-Mover's Distance

arXiv.org Machine Learning

Cross-lingual document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. Such aligned data can be used for a variety of NLP tasks from training cross-lingual representations to mining parallel bitexts for machine translation training. In this paper we develop an unsupervised scoring function that leverages cross-lingual sentence embeddings to compute the semantic distance between documents in different languages. These semantic distances are then used to guide a document alignment algorithm to properly pair cross-lingual web documents across a variety of low, mid, and high-resource language pairs. Recognizing that our proposed scoring function and other state of the art methods are computationally intractable for long web documents, we utilize a more tractable greedy algorithm that performs comparably. We experimentally demonstrate that our distance metric performs better alignment than current baselines outperforming them by 7% on high-resource language pairs, 15% on mid-resource language pairs, and 22% on low-resource language pairs


AMR Similarity Metrics from Principles

arXiv.org Artificial Intelligence

Different metrics have been proposed to compare Abstract Meaning Representation (AMR) graphs. The canonical Smatch metric (Cai and Knight, 2013) aligns variables from one graph to another and compares the matching triples. The recently released SemBleu metric (Song and Gildea, 2019) is based on the machine-translation metric Bleu (Papineni et al., 2002), increasing computational efficiency by ablating a variable-alignment step and aiming at capturing more global graph properties. Our aims are threefold: i) we establish criteria that allow us to perform a principled comparison between metrics of symbolic meaning representations like AMR; ii) we undertake a thorough analysis of Smatch and SemBleu where we show that the latter exhibits some undesirable properties. E.g., it violates the identity of indiscernibles rule and introduces biases that are hard to control; iii) we propose a novel metric S2match that is more benevolent to only very slight meaning deviations and targets the fulfilment of all established criteria. We assess its suitability and show its advantages over Smatch and SemBleu.


Unsupervised Multilingual Alignment using Wasserstein Barycenter

arXiv.org Machine Learning

We study unsupervised multilingual alignment, the problem of finding word-to-word translations between multiple languages without using any parallel data. One popular strategy is to reduce multilingual alignment to the much simplified bilingual setting, by picking one of the input languages as the pivot language that we transit through. However, it is well-known that transiting through a poorly chosen pivot language (such as English) may severely degrade the translation quality, since the assumed transitive relations among all pairs of languages may not be enforced in the training process. Instead of going through a rather arbitrarily chosen pivot language, we propose to use the Wasserstein barycenter as a more informative ''mean'' language: it encapsulates information from all languages and minimizes all pairwise transportation costs. We evaluate our method on standard benchmarks and demonstrate state-of-the-art performances.


Generating Representative Headlines for News Stories

arXiv.org Artificial Intelligence

Millions of news articles are published online every day, which can be overwhelming for readers to follow. Grouping articles that are reporting the same event into news stories is a common way of assisting readers in their news consumption. However, it remains a challenging research problem to efficiently and effectively generate a representative headline for each story. Automatic summarization of a document set has been studied for decades, while few studies have focused on generating representative headlines for a set of articles. Unlike summaries, which aim to capture most information with least redundancy, headlines aim to capture information jointly shared by the story articles in short length, and exclude information that is too specific to each individual article. In this work, we study the problem of generating representative headlines for news stories. We develop a distant supervision approach to train large-scale generation models without any human annotation. This approach centers on two technical components. First, we propose a multi-level pre-training framework that incorporates massive unlabeled corpus with different quality-vs.-quantity balance at different levels. We show that models trained within this framework outperform those trained with pure human curated corpus. Second, we propose a novel self-voting-based article attention layer to extract salient information shared by multiple articles. We show that models that incorporate this layer are robust to potential noises in news stories and outperform existing baselines with or without noises. We can further enhance our model by incorporating human labels, and we show our distant supervision approach significantly reduces the demand on labeled data.


Otter.ai expands in Japan in partnership with NTT DOCOMO

#artificialintelligence

Otter.ai to Bring AI-Powered Meeting Note Collaboration Service to Japan in partnership with NTT DOCOMO Partnership includes Investment and Customer Trials of Otter's Real-Time Transcription Los Altos, CA, January 23, 2020 –Otter.ai DOCOMO made a strategic investment in Otter through its wholly-owned subsidiary NTT DOCOMO Ventures, Inc. and announced plans for its AI-based translation service subsidiary to integrate Otter's meeting note collaboration into its offering to provide highly accurate English transcripts translated into Japanese. As a part of Otter's customer engagement with DOCOMO the Otter Voice Meeting Notes application is being used on a trial basis in Berlitz Corporation's English language classes in Japan. Students use Otter to transcribe and review the content of lessons, click on sections of text, and initiate voice playback. DOCOMO, Otter.ai and Berlitz are expanding their collaboration in language education to verify Otter's effectiveness in the study of English DOCOMO is featuring Otter during demonstrations at the DOCOMO Open House 2020, taking place in the Tokyo Big Sight exhibition complex January 23 and 24, 2020.


Semi-Autoregressive Training Improves Mask-Predict Decoding

arXiv.org Machine Learning

The recently proposed mask-predict decoding algorithm has narrowed the performance gap between semi-autoregressive machine translation models and the traditional left-to-right approach. We introduce a new training method for conditional masked language models, SMART, which mimics the semi-autoregressive behavior of mask-predict, producing training examples that contain model predictions as part of their inputs. Models trained with SMART produce higher-quality translations when using mask-predict decoding, effectively closing the remaining performance gap with fully autoregressive models.


Can Simple Neuron Interactions Capture Complex Linguistic Phenomena?

#artificialintelligence

Deep neural machine translation (NMT) can learn representations containing linguistic information. And despite the differences between various models, they all tend to learn similar properties. This phenomena got researchers wondering whether the learned information is fully distributed and embedded to individual neurons. Recent research results confirmed that hypothesis, revealing that simple properties such as coordinating conjunctions and determiners can be attributed to individual neurons, while more complex linguistic properties such as syntax and semantics are distributed across multiple neurons. Following on this, researchers from The Chinese University of Hong Kong, Tencent AI Lab and University of Macau have proposed a new neuron interaction based representation composition for NMT.