Goto

Collaborating Authors

 Machine Translation


Mechanism-Aware Neural Machine for Dialogue Response Generation

AAAI Conferences

To the same utterance, people's responses in everyday dialogue may be diverse largely in terms of content semantics, speaking styles, communication intentions and so on. Previous generative conversational models ignore these 1-to-n relationships between a post to its diverse responses, and tend to return high-frequency but meaningless responses. In this study we propose a mechanism-aware neural machine for dialogue response generation. It assumes that there exists some latent responding mechanisms, each of which can generate different responses for a single input post. With this assumption we model different responding mechanisms as latent embeddings, and develop a encoder-diverter-decoder framework to train its modules in an end-to-end fashion. With the learned latent mechanisms, for the first time these decomposed modules can be used to encode the input into mechanism-aware context, and decode the responses with the controlled generation styles and topics. Finally, the experiments with human judgements, intuitive examples, detailed discussions demonstrate the quality and diversity of the generated responses with 9.80% increase of acceptable ratio over the best of six baseline methods.


Bilingual Lexicon Induction from Non-Parallel Data with Minimal Supervision

AAAI Conferences

Building bilingual lexica from non-parallel data is a long-standing natural language processing research problem that could benefit thousands of resource-scarce languages which lack parallel data. Recent advances of continuous word representations have opened up new possibilities for this task, e.g. by establishing cross-lingual mapping between word embeddings via a seed lexicon. The method is however unreliable when there are only a limited number of seeds, which is a reasonable setting for resource-scarce languages. We tackle the limitation by introducing a novel matching mechanism into bilingual word representation learning. It captures extra translation pairs exposed by the seeds to incrementally improve the bilingual word embeddings. In our experiments, we find the matching mechanism to substantially improve the quality of the bilingual vector space, which in turn allows us to induce better bilingual lexica with seeds as few as 10.


Topic Aware Neural Response Generation

AAAI Conferences

We consider incorporating topic information into a sequence-to-sequence framework to generate informative and interesting responses for chatbots. To this end, we propose a topic aware sequence-to-sequence (TA-Seq2Seq) model. The model utilizes topics to simulate prior human knowledge that guides them to form informative and interesting responses in conversation, and leverages topic information in generation by a joint attention mechanism and a biased generation probability. The joint attention mechanism summarizes the hidden vectors of an input message as context vectors by message attention and synthesizes topic vectors by topic attention from the topic words of the message obtained from a pre-trained LDA model, with these vectors jointly affecting the generation of words in decoding. To increase the possibility of topic words appearing in responses, the model modifies the generation probability of topic words by adding an extra probability item to bias the overall distribution. Empirical studies on both automatic evaluation metrics and human annotations show that TA-Seq2Seq can generate more informative and interesting responses, significantly outperforming state-of-the-art response generation models.


Lattice-Based Recurrent Neural Network Encoders for Neural Machine Translation

AAAI Conferences

Neural machine translation (NMT) heavily relies on word-level modelling to learn semantic representations of input sentences.However, for languages without natural word delimiters (e.g., Chinese) where input sentences have to be tokenized first,conventional NMT is confronted with two issues:1) it is difficult to find an optimal tokenization granularity for source sentence modelling, and2) errors in 1-best tokenizations may propagate to the encoder of NMT.To handle these issues, we propose word-lattice based Recurrent Neural Network (RNN) encoders for NMT,which generalize the standard RNN to word lattice topology.The proposed encoders take as input a word lattice that compactly encodes multiple tokenizations, and learn to generate new hidden states from arbitrarily many inputs and hidden states in preceding time steps.As such, the word-lattice based encoders not only alleviate the negative impact of tokenization errors but also are more expressive and flexible to embed input sentences.Experiment results on Chinese-English translation demonstrate the superiorities of the proposed encoders over the conventional encoder.


Translation Prediction with Source Dependency-Based Context Representation

AAAI Conferences

Learning context representations is very promising to improve translation results, particularly through neural networks. Previous efforts process the context words sequentially and neglect their internal syntactic structure. In this paper, we propose a novel neural network based on bi-convolutional architecture to represent the source dependency-based context for translation prediction. The proposed model is able to not only encode the long-distance dependencies but also capture the functional similarities for better translation prediction (i.e., ambiguous words translation and word forms translation). Examined by a large-scale Chinese-English translation task, the proposed approach achieves a significant improvement (of up to +1.9 BLEU points) over the baseline system, and meanwhile outperforms a number of context-enhanced comparison system.


Joint Copying and Restricted Generation for Paraphrase

AAAI Conferences

Many natural language generation tasks, such as abstractive summarization and text simplification, are paraphrase-orientated. In these tasks, copying and rewriting are two main writing modes. Most previous sequence-to-sequence (Seq2Seq) models use a single decoder and neglect this fact. In this paper, we develop a novel Seq2Seq model to fuse a copying decoder and a restricted generative decoder. The copying decoder finds the position to be copied based on a typical attention model. The generative decoder produces words limited in the source-specific vocabulary. To combine the two decoders and determine the final output, we develop a predictor to predict the mode of copying or rewriting. This predictor can be guided by the actual writing mode in the training data. We conduct extensive experiments on two different paraphrase datasets. The result shows that our model outperforms the state-of-the-art approaches in terms of both informativeness and language quality.


Neural Machine Translation with Reconstruction

AAAI Conferences

Although end-to-end Neural Machine Translation (NMT) has achieved remarkable progress in the past two years, it suffers from a major drawback: translations generated by NMT systems often lack of adequacy. It has been widely observed that NMT tends to repeatedly translate some source words while mistakenly ignoring other words. To alleviate this problem, we propose a novel encoder-decoder-reconstructor framework for NMT. The reconstructor, incorporated into the NMT model, manages to reconstruct the input source sentence from the hidden layer of the output target sentence, to ensure that the information in the source side is transformed to the target side as much as possible. Experiments show that the proposed framework significantly improves the adequacy of NMT output and achieves superior translation result over state-of-the-art NMT and statistical MT systems.


Impact of Artificial Intelligence on Cyber Security

Huffington Post - Tech news and opinion

Machine intelligence is everywhere in facial recognition at airports to emotional sensing algorithms; machine generated Art work; legal and medical advisory search to sometimes fowl mouthed social chat bots. The Google company AI team recently announced they developed Google Neural Machine Translation system, GNMT, using a new technique that is improving results to near human translation speed accuracy. These advances that Google describe as machine translation at production scale, are testament to the rapid real-time advancement of AI into human experience and intelligence as well as beyond human capabilities. Andrew Ng of Stanford and Chief Scientist at Baidu Research famously said that word translation of 95% is 1 in every 20 words would likely be wrong, going to 99% is game changing. Andrew was quoted in a recent HBR article saying, "If a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future."


Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization

arXiv.org Machine Learning

This paper tackles the reduction of redundant repeating generation that is often observed in RNN-based encoder-decoder models. Our basic idea is to jointly estimate the upper-bound frequency of each target vocabulary in the encoder and control the output words based on the estimation in the decoder. Our method shows significant improvement over a strong RNN-based encoder-decoder baseline and achieved its best results on an abstractive summarization benchmark.


Google's AI translation tool seems to have invented its own language – World Economic Forum

#artificialintelligence

Back in September 2016, Google launched its Neural Machine Translation (GNMT) system, which uses deep learning to deliver more natural translations between languages. Google Translate originally supported only a handful of languages when it launched 10 years ago; today that number has risen to 103. Creating a computer system to translate multiple languages is complex. The people at Google who built it wanted to find out just how clever their system was. So they came up with a challenge.