 Machine Translation


The rise of machine intelligence in agriculture

#artificialintelligence

Future generations will likely look back at the development of machine learning as a turning point. It certainly is convenient to dispense with keyboards and touchscreens in favor of using ordinary speech to tell your iPhone, Android or Alexa device what you want it to do. But machine learning's far more consequential contributions to society will be found in the fields of agriculture. Those contributions cannot come soon enough. By 2050 an estimated 9.7 billion people[1] are going to need to be fed, which means that farmers have to increase their output to cover the 200,000 people who are added to the global population each and every day.


MaskGAN: Better Text Generation via Filling in the ______

arXiv.org Machine Learning

Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several machine translation and summarization benchmarks. These benchmarks are often defined by validation perplexity even though this is not a direct measure of the quality of the generated text. Additionally, these models are typically trained via maximum likelihood and teacher forcing. These methods are well-suited to optimizing perplexity but can result in poor sample quality since generating text requires conditioning on sequences of words that may have never been observed at training time. We propose to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high quality samples and have shown a lot of success in image generation. GANs were originally designed to output differentiable values, so discrete language generation is challenging for them. We claim that validation perplexity alone is not indicative of the quality of text generated by a model. We introduce an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. We show, qualitatively and quantitatively, evidence that this produces more realistic conditional and unconditional text samples compared to a maximum likelihood trained model.
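As a rough illustration of the validation perplexity this abstract refers to, the sketch below computes perplexity as the exponential of the mean negative log-likelihood per token. The function name `perplexity` and the example probabilities are assumptions made for illustration; they do not come from the MaskGAN paper.

```python
import math

def perplexity(token_probs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    token_probs: probabilities the model assigned to each observed token
    (hypothetical values; a real model would supply them).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

# A model that spreads probability uniformly over 4 choices scores
# close to 4, while a more confident model scores lower.
uniform_ppl = perplexity([0.25, 0.25, 0.25, 0.25])
confident_ppl = perplexity([0.9, 0.8, 0.95, 0.85])
```

Perplexity only measures how well the model predicts held-out tokens given the true preceding words, which is one way to see why a low value need not translate into good free-running samples.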


How AI Has Started To Impact Our Work As Designers

#artificialintelligence

There is a lot of conversation happening around Artificial Intelligence, Machine Learning, and using algorithms to shape the future of Design and the role of the designer. But how is that changing the way we work in the near future? "The end is near", according to specialists in robotics and artificial intelligence. Not really the end of the world itself, but the fact that robots will be taking over a portion of jobs currently occupied by humans. Futurist Thomas Frey, as an example, predicted in a TEDx talk that 2 billion jobs will have disappeared by 2030.


OpenNMT - Open-Source Neural Machine Translation

#artificialintelligence

SYSTRAN and HarvardNLP are very pleased to hold the first OpenNMT Workshop in Paris on March 2nd at Station F, followed by the first ever OpenNMT Hackathon on March 3rd at Télécom ParisTech. OpenNMT is an Open Source project providing neural technologies for different tasks such as automatic machine translation, text generation and summarization. The OpenNMT project is a collection of implementations on multiple frameworks designed to be simple to use and easy to extend, while maintaining efficiency and state-of-the-art accuracy. Registration is FREE and OPEN to both the OpenNMT community as well as anyone interested in Deep Learning applications for natural language processing. During the daylong hackathon, we will provide hands-on training, but also development sessions to share good development practices and to kick off development of new features or interfaces.


Age of AI Conference 2018 – Day 1 Highlights

@machinelearnbot

These are some of the highlights from Day 1 of the Age of AI Conference, held on January 31 and February 1, 2018, at the Regency Ballroom in San Francisco. The Conference owes its origins to the San Francisco Artificial Intelligence meetup that Emil Mikhailov started for those interested to learn, network and share. The community now boasts 4,700 members and has previously hosted heavyweights like Andrew Ng and Nvidia CEO Jensen Huang. The Regency Ballroom boasts a good location and acoustics. The best part was the technical focus of the Conference, well punctuated with some 'global minima' but thought-provoking touches. I will strive to do some justice to the rich technical content.


Differentiable Dynamic Programming for Structured Prediction and Attention

arXiv.org Machine Learning

Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
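To make the idea of a smoothed max operator concrete, here is a minimal sketch that assumes one particular strongly convex regularizer, the negative entropy scaled by a temperature `gamma`. Under that assumption the smoothed max becomes a scaled log-sum-exp and its gradient is a softmax. The abstract does not say which regularizer is used, so the choice of regularizer, the function name `smoothed_max`, and the parameter `gamma` are illustrative assumptions, not the authors' implementation.

```python
import math

def smoothed_max(xs, gamma=1.0):
    """Entropy-regularized max (assumed regularizer: gamma * negative entropy).

    Returns (value, gradient):
      value    = gamma * log(sum(exp(x / gamma)))  # upper bound on max(xs)
      gradient = softmax(xs / gamma)               # a probability vector
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    m = max(xs)  # subtract the max for numerical stability
    weights = [math.exp((x - m) / gamma) for x in xs]
    z = sum(weights)
    value = m + gamma * math.log(z)
    grad = [w / z for w in weights]
    return value, grad

scores = [1.0, 2.0, 3.0]
soft_value, soft_grad = smoothed_max(scores, gamma=1.0)
sharp_value, sharp_grad = smoothed_max(scores, gamma=0.01)
```

With a small `gamma` the value approaches the hard max and the gradient concentrates on the largest entry; with a larger `gamma` the gradient spreads over all entries, which is what makes the operator usable inside backpropagation.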


Is your software racist?

#artificialintelligence

Late last year, a St. Louis tech executive named Emre Şarbak noticed something strange about Google Translate. He was translating phrases from Turkish -- a language that uses a single gender-neutral pronoun "o" instead of "he" or "she." But when he asked Google's tool to turn the sentences into English, they seemed to read like a children's book out of the 1950s. The ungendered Turkish sentence "o is a nurse" would become "she is a nurse," while "o is a doctor" would become "he is a doctor." The website Quartz went on to compose a sort-of poem highlighting some of these phrases; Google's translation program decided that soldiers, doctors and entrepreneurs were men, while teachers and nurses were women.


The Algorithm That Helped Google Translate Become Sexist

#artificialintelligence

StitchFix CEO Katrina Lake posted this on Twitter on the day of her company's IPO in 2017. Automated translation is using the same kind of models for suggesting words that are sometimes laced with bias. Parents know one particular challenge of raising kids all too well: teaching them to do what we say, not what we do. A similar challenge has hit artificial intelligence. As more apps and software use AI to automate tasks, a popular data-backed model, called "word embedding," has also picked up entrenched social biases.


Here's What Machine Translation Researchers Are Geeking Out On Slator

#artificialintelligence

Cornell University's automated online distribution system for research papers, Arxiv.org, is a prolific source for anyone interested in staying up to date on progress in neural machine translation (NMT). It has been almost a year since we first wrote about the dramatic acceleration of academic NMT research as reflected in the number of papers submitted to Arxiv, and the upward trend continues. To understand where current research is heading, we reviewed NMT-related papers within the research repository for the first six weeks of 2018 as well as the last couple of months of the previous year. From November 1, 2017 to February 14, 2018, there were 58 relevant papers. Twelve of those papers are not directly about NMT, but focus on either machine learning via neural networks in general or adjacent technology such as natural language processing.


$A^{4}NT$: Author Attribute Anonymity by Adversarial Training of Neural Machine Translation

arXiv.org Machine Learning

Text-based analysis methods can reveal privacy-relevant author attributes such as the gender, age and identity of the text's author. Such methods can compromise the privacy of an anonymous author even when the author tries to remove privacy-sensitive content. In this paper, we propose an automatic method, called Adversarial Author Attribute Anonymity Neural Translation ($A^4NT$), to combat such text-based adversaries. We combine sequence-to-sequence language models used in machine translation and generative adversarial networks to obfuscate author attributes. Unlike machine translation techniques which need paired data, our method can be trained on unpaired corpora of text containing different authors. Importantly, we propose and evaluate techniques to impose constraints on our $A^4NT$ to preserve the semantics of the input text. $A^4NT$ learns to make minimal changes to the input text to successfully fool author attribute classifiers, while aiming to maintain the meaning of the input. We show through experiments on two different datasets and three settings that our proposed method is effective in fooling the author attribute classifiers and thereby improving the anonymity of authors.