 Machine Translation


Unsupervised Machine Translation Using Monolingual Corpora Only

arXiv.org Artificial Intelligence

Machine translation has recently achieved impressive performance thanks to advances in deep learning and the availability of large-scale parallel corpora. There have been numerous attempts to extend these successes to low-resource language pairs, yet these still require tens of thousands of parallel sentences. In this work, we take this research direction to the extreme and investigate whether it is possible to learn to translate even without any parallel data. We propose a model that takes sentences from monolingual corpora in two different languages and maps them into the same latent space. By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data. We demonstrate our model on two widely used datasets and two language pairs, reporting BLEU scores of 32.8 and 15.1 on the Multi30k and WMT English-French datasets, without using even a single parallel sentence at training time.


Choosing the Right Metric for Evaluating ML Models -- Part 1

@machinelearnbot

In this first post, we will cover metrics for regression only. Most blog posts have focussed on classification metrics like precision, recall, AUC etc. For a change, I wanted to explore all kinds of metrics, including those used in regression. MAE and RMSE are the two most popular metrics for continuous variables. Let's start with the more popular one.
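As a minimal sketch of the two metrics named above: MAE (mean absolute error) averages the absolute differences between predictions and targets, while RMSE (root mean squared error) takes the square root of the average squared difference. The function names and the sample values below are illustrative assumptions, not taken from the original post.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |target - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Made-up targets and predictions, for illustration only.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 4.0]

mae_value = mae(y_true, y_pred)    # absolute errors 1, 0, 2, 3 -> 6 / 4
rmse_value = rmse(y_true, y_pred)  # squared errors 1, 0, 4, 9 -> sqrt(14 / 4)

print(f"MAE:  {mae_value:.4f}")
print(f"RMSE: {rmse_value:.4f}")
```

Because the errors are squared before averaging, RMSE weights large errors more heavily than MAE does, so RMSE is never smaller than MAE on the same data.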


Stanford's NLP Course Projects are Available Online and they're Super Impressive

@machinelearnbot

Stanford has long been considered one of the best universities in terms of teaching, quality of faculty and the content they teach. With the recent boom in the machine learning field, Stanford's ML courses have generated a lot of interest (you can find the lecture videos on YouTube if you haven't watched them already). Each year, Stanford releases a list of projects that its students have worked on, and it has recently released the list of course projects for its Natural Language Processing (NLP) course. And wow, is it impressive. Students were given two options for the project: either choose your own topic (called 'Custom Project') or take part in the 'Default Project', which was building Question Answering models based on the SQuAD challenge.


Asia is the next frontier for AI development - Asia News Center

#artificialintelligence

This article was originally posted on LinkedIn. In a few short years, Artificial Intelligence (AI) has been thrust into the limelight, elevating itself from a far-fetched, science-fiction topic to one that is currently dominating my conversations with customers, partners and industry leaders across Asia. The journey to where we are today with AI is a long one, almost seven decades in the making. However, in the last few years, the convergence of big data and ubiquitous, powerful cloud computing, along with breakthroughs in software algorithms and machine learning, has made exciting new scenarios in AI deployment a possibility. AI today is at the center of the digital transformation of organizations and even nations.


The Amazing Ways Google Uses Artificial Intelligence And Satellite Data To Prevent Illegal Fishing

#artificialintelligence

Google services such as its image search and translation tools use sophisticated machine learning, which allows computers to see, listen and speak in much the same way as humans do. Machine learning is the term for the current cutting-edge applications in artificial intelligence. Basically, the idea is that by teaching machines to "learn" by processing huge amounts of data, they will become increasingly better at carrying out tasks that traditionally could only be completed by human brains. These techniques include "computer vision": training computers to recognize images in a way similar to how we do. For example, an object with four legs and a tail has a high probability of being an animal.


Machine Learning With Deeplearning4j and Eclipse Scout - DZone AI

#artificialintelligence

Machine learning, and deep learning in particular, are developing at amazing speed. Today, machine learning can be used to solve ever more complex tasks that were considered impractical just a few years ago. Examples include autonomous cars, AlphaGo's win against the world's Go champion, the photo-realistic transformation of pictures, and neural machine translation systems. In this blog post, we describe a simple system to recognize monetary amounts on Swiss payment slips. The user interface is implemented using Eclipse Scout, and we build, train, and run the deep neural net using Deeplearning4j.


Statistical Machine Translation Is a Natural Fit for Automatic Identifier Renaming in Software Source Code

AAAI Conferences

Advances in natural language processing have led to a variety of successful tools and techniques for solving problems such as understanding, generating, and translating natural languages. Given the success of these techniques, a natural question is whether they can also be applied to programming languages. However, the results of the initial research have been mixed. Researchers attempting to translate between programming languages by employing statistical machine translation (SMT) found that a large percentage of the translated programs were not syntactically valid. On the other hand, SMT has been successfully employed to recover identifiers in obfuscated JavaScript code. In this paper, we discuss several differences between natural languages and programming languages that can thwart successful application of NLP techniques to program transformation. We also discuss several strategies to cope with these differences in practice, using as an example our own experience applying SMT to assign meaningful identifier names to variables in decompiled C programs.


Mix and match networks: encoder-decoder alignment for zero-pair image translation

arXiv.org Machine Learning

We address the problem of image translation between domains or modalities for which no direct paired data is available (i.e. zero-pair translation). We propose mix and match networks, based on multiple encoders and decoders aligned in such a way that other encoder-decoder pairs can be composed at test time to perform unseen image translation tasks between domains or modalities for which explicit paired samples were not seen during training. We study the impact of autoencoders, side information and losses in improving the alignment and transferability of trained pairwise translation models to unseen translations. We show our approach is scalable and can perform colorization and style transfer between unseen combinations of domains. We evaluate our system in a challenging cross-modal setting where semantic segmentation is estimated from depth images, without explicit access to any depth-semantic segmentation training pairs. Our model outperforms baselines based on pix2pix and CycleGAN models.


Can Microsoft get smarter? Inside the tech giant's massive bet on AI

#artificialintelligence

Microsoft has so far released its artificial intelligence technologies largely through its well-known software platforms, such as the Cortana voice assistant on Windows 10, automated language translation in Microsoft Office, and AI-powered speech, vision, search and language technologies for developers on Microsoft Azure. Artificial intelligence specialists at the company are now working closely with its devices group, said Harry Shum, the executive vice president of Microsoft's AI and Research group, in a broader interview with GeekWire about the next phase of the company's AI initiatives. Without giving details, Shum said he expects some "very, very exciting devices" to result from the work by the company's AI engineers and devices group. Shum mentioned this as an aside, not to get the gadget blogs buzzing but to underscore the scope of what Microsoft is trying to do. As part of the massive engineering reorganization announced by CEO Satya Nadella last week, the company is attempting to bring artificial intelligence into everything it does.


10 Machine Learning Algorithms You Should Know to Become a Data Scientist - DZone AI

#artificialintelligence

Let's say I am given an Excel sheet with data about various fruits and I have to tell which look like apples. What I will do is ask the question "Which fruits are red and round?" and divide all fruits into those that answer yes and those that answer no. Now, not all red and round fruits are apples, and not all apples are red and round. So I will ask the question "Which fruits have red or yellow color hints on them?" of the red and round fruits, and "Which fruits are green and round?" of the fruits that are not red and round. Based on these questions I can tell with considerable accuracy which are apples. This cascade of questions is what a decision tree is. However, this is a decision tree based on my intuition.
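The cascade of questions described above can be sketched as nested conditionals. This is only an illustration of the hand-built, intuition-based tree: the dictionary keys, the sample fruits, and the choice to label a "yes" on each second question as "looks like an apple" are assumptions made for the example, not part of the original post.

```python
def looks_like_apple(fruit):
    """Walk the hand-written question cascade for a single fruit."""
    # First question: "Is the fruit red and round?"
    red_and_round = fruit["color"] == "red" and fruit["shape"] == "round"
    if red_and_round:
        # Follow-up on the "yes" branch: red or yellow color hints?
        return fruit["red_or_yellow_hints"]
    # Follow-up on the "no" branch: "Is the fruit green and round?"
    return fruit["color"] == "green" and fruit["shape"] == "round"

# Made-up rows standing in for the spreadsheet of fruits.
fruits = [
    {"name": "red apple", "color": "red", "shape": "round", "red_or_yellow_hints": True},
    {"name": "plum", "color": "red", "shape": "round", "red_or_yellow_hints": False},
    {"name": "green apple", "color": "green", "shape": "round", "red_or_yellow_hints": False},
    {"name": "banana", "color": "yellow", "shape": "long", "red_or_yellow_hints": True},
]

apple_like = [f["name"] for f in fruits if looks_like_apple(f)]
print(apple_like)
```

Each `if` corresponds to one internal node of the tree, and each `return` to a leaf. A learned decision tree has the same shape, except that the questions and their order are chosen from the data rather than by intuition.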