Deep Learning
There's a raging talent war for AI experts and it's costing automakers millions
The self-driving car space is getting increasingly cutthroat. The sheer number of lawsuits filed recently is a testament to that. Tesla, for example, is suing its former Autopilot director Sterling Anderson. The lawsuit claims Anderson stole data for a competing venture, Aurora Innovations, that hasn't even come out of stealth mode yet. "In their zeal to play catch-up, traditional automakers have created a get-rich-quick environment. Small teams of programmers with little more than demoware have been bought for as much as a billion dollars. Cruise Automation, a 40-person firm, was purchased by General Motors in July 2016 for nearly $1 billion. In August 2016, Uber acquired Otto, another self-driving startup that had been founded only seven months earlier, in a deal worth more than $680 million."
Artificial Intelligence will help answer queries automatically: Rajeev Rastogi, Amazon
Rajeev Rastogi, who heads the Machine Learning team at Amazon, explains how the global ecommerce giant employs Artificial Intelligence to improve the online shopping experience. Edited excerpts: In which areas does Amazon use AI? We are applying AI to a number of problems such as speech recognition, natural language understanding, question answering, dialog systems, product recommendations, product search, forecasting future product demand, among others. We have used Deep Learning to do better speech recognition. We use neural networks to convert speech (spoken by users) to text with very high accuracy.
DeepMind AI learns to 'remember' previous knowledge
Much like real synapses, which tend to preserve connections between neurons when they've been useful in the past, the algorithm (known as Elastic Weight Consolidation) decides how important a given connection is to its associated task. Ask the neural network to learn a new task and the algorithm will safeguard the most valuable connections, linking them to new tasks when relevant. In tests with 10 classic Atari video games, the AI didn't need to learn how to play each game in isolation. It could learn them sequentially, taking the knowledge accrued in one game and applying it to the next. The technology is more than a little rough around the edges.
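The idea of safeguarding the most valuable connections can be sketched as a penalty added to the loss of the new task. The snippet above gives no formula, so the quadratic form below is an assumption based on how this kind of method is commonly written down: each weight is pulled back toward its value after the old task, with a pull proportional to an importance score. The names ewc_penalty, anchor_params, importance and strength are illustrative, not taken from DeepMind's code.

```python
import numpy as np

def ewc_penalty(params, anchor_params, importance, strength=1.0):
    """Quadratic penalty that discourages moving important weights.

    Assumed form: 0.5 * strength * sum_i importance_i * (params_i - anchor_i)^2,
    where anchor_params are the weights learned on the earlier task and
    importance_i is a non-negative score for weight i (hypothetical input).
    """
    params = np.asarray(params, dtype=float)
    anchor_params = np.asarray(anchor_params, dtype=float)
    importance = np.asarray(importance, dtype=float)
    return 0.5 * strength * float(np.sum(importance * (params - anchor_params) ** 2))

def total_loss(new_task_loss, params, anchor_params, importance, strength=1.0):
    """Loss on the new task plus the penalty protecting the old task."""
    return new_task_loss + ewc_penalty(params, anchor_params, importance, strength)

# Two weights: the first mattered a lot for the old task, the second barely.
anchor = [1.0, -2.0]
scores = [10.0, 0.1]
print(ewc_penalty([1.5, -2.0], anchor, scores))  # moving the important weight
print(ewc_penalty([1.0, -1.5], anchor, scores))  # moving the unimportant weight
```

Moving the important weight by 0.5 costs a hundred times more than moving the unimportant one by the same amount, which is the sense in which valuable connections are protected while the rest stay free to adapt.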
Sequential Local Learning for Latent Graphical Models
Park, Sejun, Yang, Eunho, Shin, Jinwoo
Sejun Park, Eunho Yang, Jinwoo Shin. November 4, 2017. Abstract: Learning parameters of latent graphical models (GM) is inherently much harder than that of non-latent ones since the latent variables make the corresponding log-likelihood non-concave. Nevertheless, expectation-maximization schemes are popularly used in practice, but they typically get stuck in local optima. In recent years, the method of moments has provided a refreshing angle for resolving the non-convex issue, but it is applicable to a quite limited class of latent GMs. In this paper, we aim to enhance its power by enlarging this class of latent GMs. To this end, we introduce two novel concepts, coined marginalization and conditioning, which can reduce the problem of learning a larger GM to that of a smaller one. More importantly, they lead to a sequential learning framework that repeatedly increases the learning portion of a given latent GM, and thus covers a significantly broader and more complicated class of loopy latent GMs which include convolutional and random regular models. 1 Introduction: Graphical models (GM) are succinct representations of a joint distribution on a graph where each node corresponds to a random variable and each edge represents the conditional independence between random variables. GMs have been successfully applied in various fields including information theory [12, 19], physics [24] and machine learning [18, 11]. Introducing latent variables to GMs has been a popular approach for enhancing their representational power in recent deep models, e.g., convolutional/restricted/deep Boltzmann machines [20, 27]. Furthermore, they are inevitable in certain scenarios when part of the samples is missing, e.g., see [10]. However, learning parameters of latent GMs is significantly harder than that of non-latent ones since the latent variables make the corresponding negative log-likelihood non-convex.
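The claim that latent variables make the log-likelihood non-concave can be checked on a toy case. The sketch below is not from the paper: it assumes a hypothetical two-component Gaussian mixture with unit variance, equal weights and means at +mu and -mu, where the component label is the latent variable. A concave function would satisfy f((a + b) / 2) >= (f(a) + f(b)) / 2, and the example violates that inequality.

```python
import numpy as np

def mixture_log_likelihood(mu, data):
    """Log-likelihood of 0.5 * N(mu, 1) + 0.5 * N(-mu, 1) on the given data.

    The component that generated each point is latent and summed out,
    which is what breaks concavity in mu.
    """
    data = np.asarray(data, dtype=float)

    def normal_pdf(x, mean):
        return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

    per_point = 0.5 * normal_pdf(data, mu) + 0.5 * normal_pdf(data, -mu)
    return float(np.sum(np.log(per_point)))

data = [-2.0, 2.0]  # toy observations, chosen for illustration
left = mixture_log_likelihood(-2.0, data)
right = mixture_log_likelihood(2.0, data)
middle = mixture_log_likelihood(0.0, data)

# Concavity would require middle >= (left + right) / 2; here it fails.
print(middle, 0.5 * (left + right))
```

The two symmetric maxima at mu = 2 and mu = -2 with a dip between them are the kind of landscape in which a local method such as expectation-maximization can end up at a point that depends on its initialization.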
Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence
Jorge, Emilio, Kågebäck, Mikael, Johansson, Fredrik D., Gustavsson, Emil
Acquiring your first language is an incredible feat and not easily duplicated. Learning to communicate using nothing but a few pictureless books, a corpus, would likely be impossible even for humans. Nevertheless, this is the dominating approach in most natural language processing today. As an alternative, we propose the use of situated interactions between agents as a driving force for communication, and the framework of Deep Recurrent Q-Networks for evolving a shared language grounded in the provided environment. We task the agents with interactive image search in the form of the game Guess Who?. The images from the game provide a non-trivial environment for the agents to discuss and a natural grounding for the concepts they decide to encode in their communication. Our experiments show that the agents learn not only to encode physical concepts in their words, i.e. grounding, but also to hold a multi-step dialogue, remembering the state of the dialogue from step to step.
Semantic Change Detection with Hypermaps
Suzuki, Teppei, Shirakabe, Soma, Miyashita, Yudai, Nakamura, Akio, Satoh, Yutaka, Kataoka, Hirokatsu
Change detection is the study of detecting changes between two different images of a scene taken at different times. From the detected change areas alone, however, a human cannot understand how the two images differ. Therefore, a semantic understanding is required in change detection research for applications such as disaster investigation. The paper proposes the concept of semantic change detection, which involves intuitively inserting semantic meaning into detected change areas. We mainly focus on the novel semantic segmentation in addition to a conventional change detection approach. In order to solve this problem and obtain a high level of performance, we propose an improvement to the hypercolumns representation, hereafter known as hypermaps, which effectively uses convolutional maps obtained from convolutional neural networks (CNNs). We also employ multi-scale feature representation captured by different image patches. We applied our method to the TSUNAMI Panoramic Change Detection dataset, and re-annotated the changed areas of the dataset via semantic classes. The results show that our multi-scale hypermaps provided outstanding performance on the re-annotated TSUNAMI dataset.
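For readers unfamiliar with the hypercolumns representation that hypermaps build on, here is a minimal sketch of the plain hypercolumn idea only: feature maps from several CNN layers are resized to a common resolution and stacked along the channel axis, so every pixel gets one long descriptor. The abstract does not describe how hypermaps change this, so nothing below should be read as the authors' method. The random arrays stand in for real CNN activations, and the function names are hypothetical.

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a (channels, height, width) feature map."""
    _, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return fmap[:, rows[:, None], cols[None, :]]

def hypercolumns(feature_maps, out_h, out_w):
    """Stack resized feature maps along the channel axis."""
    resized = [upsample_nearest(f, out_h, out_w) for f in feature_maps]
    return np.concatenate(resized, axis=0)

rng = np.random.default_rng(0)
# Stand-ins for activations from three layers of decreasing resolution.
maps = [
    rng.standard_normal((4, 8, 8)),
    rng.standard_normal((8, 4, 4)),
    rng.standard_normal((16, 2, 2)),
]
stacked = hypercolumns(maps, 8, 8)
print(stacked.shape)       # one 28-dimensional descriptor per pixel
print(stacked[:, 3, 5].shape)
```

A per-pixel classifier over such descriptors sees both fine detail from early layers and coarse context from late layers, which is why this family of representations suits segmentation-style tasks.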
Recurrent Orthogonal Networks and Long-Memory Tasks
Henaff, Mikael, Szlam, Arthur, LeCun, Yann
Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter and Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
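One intuition for why unitary or orthogonal transition matrices help with long-memory tasks is that they preserve the norm of the hidden state, so stored information neither fades nor blows up over many steps. The sketch below illustrates only that property on a linear recurrence with no inputs and no nonlinearity. It is an assumption-laden toy, not the explicit constructions analyzed in the paper, and the helper names are hypothetical.

```python
import numpy as np

def random_orthogonal(n, seed=0):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def run_linear_rnn(transition, h0, steps):
    """Apply h <- transition @ h repeatedly (no input, no nonlinearity)."""
    h = h0
    for _ in range(steps):
        h = transition @ h
    return h

n = 16
W = random_orthogonal(n)
h0 = np.random.default_rng(1).standard_normal(n)

kept = run_linear_rnn(W, h0, 1000)         # orthogonal: norm is preserved
faded = run_linear_rnn(0.9 * W, h0, 100)   # slightly contractive: norm decays

print(np.linalg.norm(h0), np.linalg.norm(kept), np.linalg.norm(faded))
```

Scaling the same matrix by 0.9 is enough to wipe out the state within about a hundred steps, which is the vanishing behavior that orthogonal initializations or constraints are meant to avoid.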
Cloudera to Accelerate Data Science and Machine Learning for the Enterprise with New Data Science Workbench
STRATA HADOOP WORLD SAN JOSE, Calif., March 14, 2017 – Cloudera, the provider of the leading global platform for machine learning and advanced analytics built on the latest open source technologies, today unveiled Cloudera Data Science Workbench, a new self-service environment for data science on Cloudera Enterprise which is currently in beta. Based on the company's acquisition of data science startup Sense.io last year, Data Science Workbench allows data scientists to use their favorite open source languages -- including R, Python, and Scala -- and libraries on a secure enterprise platform with native Apache Spark and Apache Hadoop integration, to accelerate analytics projects from exploration to production. "Cloudera is focused on improving the user experience for data science and engineering teams, in particular those who want to scale their analytics using Spark for data processing and machine learning," said Charles Zedlewski, senior vice president, Products at Cloudera. "The acquisition of Sense.io and its team provided a strong foundation, and Data Science Workbench now puts self-service data science at scale within reach for our customers." Cloudera Data Science Workbench's benefits include: Beyond the extensive Python and R ecosystems, as open data science expands to include deep learning frameworks like TensorFlow, Microsoft Cognitive Toolkit, MXNet, BigDL, and more, data science teams are looking for ways to bring these tools to their data, which is increasingly stored in Hadoop environments. Cloudera Data Science Workbench delivers a safe and secure environment to combine the latest open source innovations with the unified platform Cloudera customers trust.
IBM Bridge and Tunnel Investor
IBM unveils Power8 Linux servers for deep learning. IBM has launched three Power8 Linux servers designed to accelerate artificial intelligence, deep learning, and advanced analytics applications. The new systems tap the Nvidia NVLink technology to move data five times faster than any competing platform, said Stefanie Chiras, an IBM vice president, in an interview with VentureBeat. These systems and their operating systems are part of a larger business group that generates about $2 billion a quarter for IBM. And the A.I. markets they're going after have exploded in the past couple of years.
Baidu Deep Voice explained: Part 1 -- the Inference Pipeline – Athelas
This post is the first in what I hope will be a series covering recently published ML/AI papers that I think are particularly important. Some of the ideas in these papers are fairly intuitive and I hope I'm able to communicate some of that intuition in this format. For the first paper, I'll be covering Baidu's Deep Voice paper that applies Deep Learning to Text to Speech Systems. Recently, Andrew Ng's Baidu AI Team released an impressive paper on a new Deep Learning based system for converting text to speech. An example of the speech that Baidu's system is able to produce is shown below.