In his best-selling 2011 book Thinking, Fast and Slow, Nobel laureate Daniel Kahneman hypothesized that thinking could be broken down into two distinct processes, aptly named fast and slow thought. The former is all about your gut, the initial automatic responses you have to things, while the latter is calculated, reflective and time-consuming. A new algorithm from DeepMind is beginning to show us that so-called "slow" thinking may soon be within the reach of machine learning. In a new paper published in Nature, the Google subsidiary explained a new approach to machine learning that uses something called a differentiable neural computer. Neural networks operate using what essentially amounts to a very sophisticated trial and error process, eventually arriving at an answer.
Modern neural network training relies on piece-wise (sub-)differentiable functions in order to use backpropagation for efficient calculation of gradients. In this work, we introduce a novel method to allow for non-differentiable functions at intermediary layers of deep neural networks. We do so through the introduction of a differentiable approximation bridge (DAB) neural network which provides smooth approximations to the gradient of the non-differentiable function. We present strong empirical results (performing over 600 experiments) in three different domains: unsupervised (image) representation learning, image classification, and sequence sorting to demonstrate that our proposed method improves state-of-the-art performance. We demonstrate that utilizing non-differentiable functions in unsupervised (image) representation learning improves reconstruction quality and posterior linear separability by 10%. We also observe an accuracy improvement of 77% in neural sequence sorting and a 25% improvement against the straight-through estimator in an image classification setting with the sort non-linearity. This work enables the usage of functions that were previously not usable in neural networks.
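For readers unfamiliar with the straight-through estimator used as a baseline above, the following is a minimal sketch of the idea in plain Python. It is not the DAB method from the abstract. The one-parameter model, the squared-error loss, the toy data, and every function name are assumptions made for illustration only.

```python
# Straight-through estimator (STE) sketch: the forward pass uses a hard,
# non-differentiable sign function; the backward pass pretends that
# function was the identity so a useful gradient can still flow.
# Assumption: toy model y = sign(w * x) with loss (y - target)^2.


def sign(z):
    """Hard threshold; its true derivative is zero almost everywhere."""
    return 1.0 if z >= 0 else -1.0


def ste_grad(w, x, target):
    """Approximate d(loss)/dw using the straight-through rule."""
    y = sign(w * x)
    dloss_dy = 2.0 * (y - target)
    dy_dz = 1.0  # straight-through: treat sign as identity when differentiating
    dz_dw = x
    return dloss_dy * dy_dz * dz_dw


def train(w, data, lr=0.1, steps=20):
    """Plain gradient descent over the toy data using the STE gradient."""
    for _ in range(steps):
        for x, target in data:
            w -= lr * ste_grad(w, x, target)
    return w


# Hypothetical data where the target is simply the sign of the input,
# so any positive weight classifies every example correctly.
data = [(1.0, 1.0), (-2.0, -1.0), (0.5, 1.0)]
w_final = train(-1.0, data)
```

Starting from a negative weight, the exact gradient of the loss would be zero almost everywhere and training would stall; with the straight-through rule the weight moves to a positive value that fits the toy data. A learned approximation such as the DAB described above would replace the fixed `dy_dz = 1.0` with a gradient supplied by a separate smooth network.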
In this talk, we will be discussing PyTorch: a deep learning framework that has fast neural networks that are dynamic in nature. PyTorch is written in a mix of Python and C/C++ and is targeted for very high performance using GPUs and CPUs. We'll be discussing the design and challenges of PyTorch, as well as the need for the dynamic nature because of new-age AI research. We will also be talking about a Tensor compiler that powers PyTorch, which fuses operations on the fly to make them faster. Soumith Chintala is a Research Engineer at Facebook AI Research.
We study the problem of multiset prediction. The goal of multiset prediction is to train a predictor that maps an input to a multiset consisting of multiple items. Unlike existing problems in supervised learning, such as classification, ranking and sequence generation, there is no known order among items in a target multiset, and each item in the multiset may appear more than once, making this problem extremely challenging. In this paper, we propose a novel multiset loss function by viewing this problem from the perspective of sequential decision making. The proposed multiset loss function is empirically evaluated on two families of datasets, one synthetic and the other real, with varying levels of difficulty, against various baseline loss functions including reinforcement learning, sequence, and aggregated distribution matching loss functions. The experiments reveal the effectiveness of the proposed loss function over the others.
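To make the problem setup concrete, here is a small sketch of how a predicted multiset might be compared against a target multiset, ignoring order but respecting multiplicity. It does not implement the loss function proposed in the abstract. The exact-match and F1 scores are common choices assumed here for illustration, and the function names and example items are hypothetical.

```python
from collections import Counter


def exact_match(predicted, target):
    """True when both collections hold the same items with the same counts."""
    return Counter(predicted) == Counter(target)


def multiset_f1(predicted, target):
    """F1 score over items, counting each repeated item separately."""
    pred, tgt = Counter(predicted), Counter(target)
    # Counter intersection keeps the minimum count of each shared item.
    overlap = sum((pred & tgt).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(tgt.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical target: "dog" appears twice, and order carries no meaning.
target = ["cat", "dog", "dog"]
reordered = ["dog", "cat", "dog"]
missing_one = ["cat", "dog"]
```

Under these definitions `exact_match(reordered, target)` holds because only the counts matter, while `exact_match(missing_one, target)` fails because the second "dog" is absent. This is the property that separates multiset prediction from sequence generation (where order matters) and from set prediction (where duplicates collapse).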