Deep Learning
Continual Learning Through Synaptic Intelligence
Zenke, Friedemann, Poole, Ben, Ganguli, Surya
While deep learning has led to remarkable advances across diverse applications, it struggles in domains where the data distribution changes over the course of learning. In stark contrast, biological neural networks continually adapt to changing domains, possibly by leveraging complex molecular machinery to solve many tasks simultaneously. In this study, we introduce intelligent synapses that bring some of this biological complexity into artificial neural networks. Each synapse accumulates task relevant information over time, and exploits this information to rapidly store new memories without forgetting old ones. We evaluate our approach on continual learning of classification tasks, and show that it dramatically reduces forgetting while maintaining computational efficiency.
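To make the idea of a synapse that "accumulates task relevant information over time" concrete, here is a minimal sketch of per-parameter importance bookkeeping. The abstract does not state the update rule, so everything in it is an assumption for illustration: the class `ImportanceTracker`, its method names, the `damping` constant, and the rule itself (credit each parameter with minus gradient times update, normalise by its squared total change, then penalise drift from the values held at the end of the previous task).

```python
from typing import List, Sequence


class ImportanceTracker:
    """Illustrative per-parameter importance bookkeeping (assumed rule, not from the abstract)."""

    def __init__(self, n_params: int, damping: float = 1e-3) -> None:
        self.damping = damping                               # avoids division by zero
        self.running: List[float] = [0.0] * n_params         # credit gathered in the current task
        self.importance: List[float] = [0.0] * n_params      # consolidated over finished tasks
        self.start: List[float] = [0.0] * n_params           # values when the current task began
        self.anchor: List[float] = [0.0] * n_params          # values when the last task ended

    def begin_task(self, params: Sequence[float]) -> None:
        self.start = list(params)
        self.running = [0.0] * len(self.start)

    def record_step(self, grads: Sequence[float], deltas: Sequence[float]) -> None:
        # A parameter earns credit when its update moves against the gradient,
        # i.e. when the step it took contributed to lowering the loss.
        for k, (g, d) in enumerate(zip(grads, deltas)):
            self.running[k] += -g * d

    def end_task(self, params: Sequence[float]) -> None:
        for k, p in enumerate(params):
            change = p - self.start[k]
            self.importance[k] += self.running[k] / (change * change + self.damping)
        self.anchor = list(params)

    def penalty(self, params: Sequence[float], strength: float) -> float:
        # Quadratic cost for moving important parameters away from their anchors.
        return strength * sum(
            w * (p - a) ** 2
            for w, p, a in zip(self.importance, params, self.anchor)
        )


# Toy run: only the first parameter moved (and helped) during the first task.
tracker = ImportanceTracker(n_params=2)
tracker.begin_task([0.0, 0.0])
tracker.record_step(grads=[-1.0, 0.0], deltas=[0.5, 0.0])
tracker.end_task([0.5, 0.0])
drift_cost = tracker.penalty([1.5, 3.0], strength=1.0)
```

In this toy run the second parameter can drift freely on a later task because it earned no importance, while moving the first one is penalised.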
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
Louizos, Christos, Welling, Max
We reinterpret multiplicative noise in neural networks as auxiliary random variables that augment the approximate posterior in a variational setting for Bayesian neural networks. We show that through this interpretation it is both efficient and straightforward to improve the approximation by employing normalizing flows (Rezende & Mohamed, 2015) while still allowing for local reparametrizations (Kingma et al., 2015) and a tractable lower bound (Ranganath et al., 2015; Maaløe et al., 2016). In experiments we show that with this new approximation we can significantly improve upon classical mean field for Bayesian neural networks on both predictive accuracy as well as predictive uncertainty.
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Foerster, Jakob, Nardelli, Nantas, Farquhar, Gregory, Afouras, Triantafyllos, Torr, Philip H. S., Kohli, Pushmeet, Whiteson, Shimon
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
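The second method, conditioning on a fingerprint, can be sketched as a replay memory that appends a small tag to every stored observation so the value function can tell how old a sample is. This is only a sketch under stated assumptions: the class `FingerprintReplay`, its methods, and the choice of fingerprint (training iteration and exploration rate) are hypothetical and are not specified in the abstract.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

# (augmented observation, action, reward, augmented next observation)
Transition = Tuple[List[float], int, float, List[float]]


class FingerprintReplay:
    """Replay memory whose observations carry an age fingerprint (illustrative)."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque silently drops the oldest transition when full.
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    @staticmethod
    def augment(obs: Sequence[float], iteration: int, epsilon: float) -> List[float]:
        # Assumed fingerprint: when the sample was collected and how much
        # the agents were exploring at that time.
        return list(obs) + [float(iteration), float(epsilon)]

    def add(self, obs: Sequence[float], action: int, reward: float,
            next_obs: Sequence[float], iteration: int, epsilon: float) -> None:
        self.buffer.append((
            self.augment(obs, iteration, epsilon),
            action,
            reward,
            self.augment(next_obs, iteration, epsilon),
        ))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


# Store three transitions in a memory of size two, then draw a batch.
replay = FingerprintReplay(capacity=2)
for step in range(3):
    replay.add(obs=[0.1 * step], action=step % 2, reward=1.0,
               next_obs=[0.1 * (step + 1)], iteration=step,
               epsilon=1.0 - 0.1 * step)
batch = replay.sample(batch_size=2, rng=random.Random(0))
```

A Q-network trained on such batches would take the augmented observation as input, so the same raw observation collected at different stages of training is no longer ambiguous.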
Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
Foerster, Jakob N., Gilmer, Justin, Chorowski, Jan, Sohl-Dickstein, Jascha, Sussillo, David
There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations - in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture's solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task, and allows greater computational efficiency than networks with standard nonlinearities.
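The recurrence described above, an RNN with no nonlinearity whose weights are selected by the current input, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `affine` and `run_isan`, the two-symbol vocabulary, and the hand-picked 2x2 weights in `PARAMS` are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine(W: Matrix, b: Vector, h: Vector) -> Vector:
    """Return W @ h + b using plain Python lists."""
    return [
        sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
        for row, b_i in zip(W, b)
    ]


def run_isan(params: Dict[str, Tuple[Matrix, Vector]],
             inputs: Sequence[str],
             h0: Vector) -> Vector:
    """Apply h_t = W[x_t] h_{t-1} + b[x_t] for each input symbol x_t."""
    h = list(h0)
    for symbol in inputs:
        W, b = params[symbol]  # the input picks which affine map to apply
        h = affine(W, b, h)
    return h


# Hypothetical parameters: one (weight matrix, bias) pair per input symbol.
PARAMS: Dict[str, Tuple[Matrix, Vector]] = {
    "a": ([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0]),  # identity plus a shift
    "b": ([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0]),  # swap the two hidden units
}

final_state = run_isan(PARAMS, "aab", [0.0, 0.0])
```

Because every step is affine, the final state is a sum of terms, one per input, each passed through the matrices of the inputs that follow it. That is what makes the exact per-input attribution mentioned in the abstract possible.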
Deep Learning Chapter 4: Numerical Computation – Towards Data Science – Medium
During our discussion of Chapter 3 we did not get a chance to go over the Information Theory part of the chapter, so we asked Yaroslav to give us a quick overview before we dove into Chapter 4. We then proceeded to discuss Numerical Computation. Yaroslav gave us an overview of the chapter with his own slides (please see slides attached below) and then went through Ian Goodfellow's slide deck at the end of the presentation. This is the last chapter before we dive into ML and DL, aka the juicy stuff :) So stay tuned for more posts in the near future. We are meeting every Monday at USF Data Institute. The next meeting is on Monday 6/12/17, 6:30pm-8:30pm, when we will be covering Chapter 5: Machine Learning Basics.
Data Science for IoT vs Classic Data Science: 10 Differences
We alluded to the possibility of Deep Learning and IoT previously, where we said that deep learning algorithms play an important role in IoT analytics because machine data is sparse and/or has a temporal element to it. Devices may behave differently under different conditions. Hence, capturing all scenarios for the data pre-processing/training stage of an algorithm is difficult. Deep learning algorithms can help to mitigate these risks by enabling algorithms to learn on their own. This concept of machines learning on their own can be extended to machines teaching other machines.
Microsoft Upgrades Windows-Based Data Science Virtual Machine
Data Science Virtual Machine (DSVM), Microsoft's cloud-based offering for big data analytics, is now available in a new preview version based on Windows Server 2016 Datacenter Edition. Previously, the Windows version of DSVM only ran on a Windows Server 2012 image. Microsoft also makes DSVM available in Ubuntu and CentOS Linux flavors. In upgrading to Windows Server 2016, DSVM users now have access to additional tools and functionality, including Docker container support, noted Microsoft software engineer Udayan Kumar in a June 6 announcement. The new virtual machine also comes bundled with Office ProPlus and includes an upgrade to Microsoft R Server 9.1, which now features sentiment analysis and other cognitive models.
LG Electronics sets up AI division: Yonhap - The Mainichi
South Korean tech powerhouse LG Electronics Inc. said Sunday it has set up two research centers to develop technologies related to artificial intelligence, Yonhap News Agency reported. One center will focus on developing AI and the other robotics, it said. The AI center will focus on developing technologies applicable to LG Electronics' home appliance lineup, smartphones and automobile parts, while the robotics center will focus on developing "core technologies of smart robotics," Yonhap reported. A division within the AI center "will be devoted to R&D for deep learning, a new area of AI research where a computer emulates the way the human brain creates patterns from data and processes them," the report said.
Deep Learning Algorithm Rewrites Traditional Recipes for New Regions, Ingredients
Imagine your favorite go-to recipe mutated to conform to the traditional methods and ingredients of any number of diverse regional food cultures. Consider, say, lasagne, but a sort of lasagne that's instead a naturally occurring part of Japanese or Ethiopian cuisine. Not "fusion," but something deeper--a whole rewriting of what a lasagne even is according to the culinary traditions of some other place. It's not necessarily an easy or natural thing to do, but a new machine learning algorithm developed by a team of French, American, and Japanese researchers offers an automated solution based on neural networks and large amounts of food data. The result, which is described in a paper published this month to the arXiv preprint server (via I Programmer), is a system that can take a given recipe and shift it into an alternative dietary style--sushi lasagne, say--as well as parse a recipe for its underlying style components.
So, bots you say… – The AI guys – Medium
Think of it as a magical black box that understands what you tell it in your everyday English (or Spanish, or German, or Mandarin, or whatever) and answers you in the same language, and not in a stream of ones and zeros. It's most likely some sort of state machine or automaton, processing the input it received; and of course it is not magic (sorry to break the spell): it's one subset of Artificial Intelligence called Natural Language Processing (or NLP). Well, all of the above use something called Natural Language Processing. Natural Language Processing is a huge field of study, which I'll cover in detail later in the series, but for now, let's move on.