 Education


12 Best Deep Learning Books In 2018 - Ranked In Order Of Awesomeness!

#artificialintelligence

I'm sure you'll agree that Artificial Intelligence, in particular Deep Learning, has made huge strides in the last 5 years or so. What began as a relatively niche field with just a handful of researchers has now become so mainstream that the apps and services we use every day rely on Deep Learning to perform tasks that were unthinkable not that long ago. The underlying ideas have been around since the 1940s, when Warren McCulloch and Walter Pitts created a computational model for neural networks based on mathematics and algorithms. However, "Deep Learning" only began to gain popularity in the mid-2000s, when Geoffrey Hinton and Ruslan Salakhutdinov published a paper that showed how a multi-layered neural network could be pre-trained one layer at a time. In 2009 it was discovered that with large enough datasets, you didn't actually need the pre-training and that error rates could drop significantly as a result.


How will artificial intelligence affect employment and education?

#artificialintelligence

I'm skeptical of arguments that technology will have severe detrimental effects on employment for many reasons. But one reason is this: If artificial intelligence (AI) turns out to be as powerful as the worriers say, won't it be good at finding new nonobvious tasks for humans and also training them for these new occupations? [Photo caption: An artificial intelligence project utilizing a humanoid robot from French company Aldebaran, reprogrammed for the specific campus, makes its debut as an assistant for students attending Palomar College in San Marcos, California. REUTERS] Long before we cross such a science fiction threshold, however, we are beginning to see how technology will improve employment opportunities. For example, in the latest in a long series of reports on the topic, Michael Mandel shows yet again how technology usually helps workers.


Multi-Instance Dynamic Ordinal Random Fields for Weakly-supervised Facial Behavior Analysis

arXiv.org Artificial Intelligence

We propose a Multi-Instance-Learning (MIL) approach for weakly-supervised learning problems, where a training set is formed by bags (sets of feature vectors or instances) and only labels at bag-level are provided. Specifically, we consider the Multi-Instance Dynamic-Ordinal-Regression (MI-DOR) setting, where the instance labels are naturally represented as ordinal variables and bags are structured as temporal sequences. To this end, we propose Multi-Instance Dynamic Ordinal Random Fields (MI-DORF). In this framework, we treat instance-labels as temporally-dependent latent variables in an Undirected Graphical Model. Different MIL assumptions are modelled via newly introduced high-order potentials relating bag and instance-labels within the energy function of the model. We also extend our framework to address Partially-Observed MI-DOR problems, where a subset of instance labels is available during training. We show on the tasks of weakly-supervised facial behavior analysis, Facial Action Unit (DISFA dataset) and Pain (UNBC dataset) Intensity estimation, that the proposed framework outperforms alternative learning approaches. Furthermore, we show that MI-DORF can be employed to substantially reduce the data annotation effort in this context.


Learning Flexible and Reusable Locomotion Primitives for a Microrobot

arXiv.org Machine Learning

The design of gaits for robot locomotion can be a daunting process that requires significant expert knowledge and engineering. This process is even more challenging for robots that do not have an accurate physical model, such as compliant or micro-scale robots. Data-driven gait optimization provides an automated alternative to analytical gait design. In this paper, we propose a novel approach to efficiently learn a wide range of locomotion tasks with walking robots. This approach formalizes locomotion as a contextual policy search task to collect data, and subsequently uses that data to learn multi-objective locomotion primitives that can be used for planning. As a proof of concept, we consider a simulated hexapod modeled after a recently developed microrobot, and we thoroughly evaluate the performance of this microrobot on different tasks and gaits. Our results validate the proposed controller and learning scheme on single and multi-objective locomotion tasks. Moreover, the experimental simulations show that without any prior knowledge about the robot used (e.g., dynamics model), our approach is capable of learning locomotion primitives within 250 trials and subsequently using them to successfully navigate through a maze.


Deep Learning for Causal Inference

arXiv.org Machine Learning

In this paper, we propose deep learning techniques for econometrics, specifically for causal inference and for estimating individual as well as average treatment effects. The contribution of this paper is twofold: 1. For generalized neighbor matching to estimate individual and average treatment effects, we analyze the use of autoencoders for dimensionality reduction while maintaining the local neighborhood structure among the data points in the embedding space. This deep learning based technique is shown to perform better than simple k nearest neighbor matching for estimating treatment effects, especially when the data points have several features/covariates but reside in a low dimensional manifold in high dimensional space. We also observe better performance than manifold learning methods for neighbor matching. 2. Propensity score matching is one specific and popular way to perform matching in order to estimate average and individual treatment effects. We propose the use of deep neural networks (DNNs) for propensity score matching, and present a network called PropensityNet for this. This is a generalization of the logistic regression technique traditionally used to estimate propensity scores and we show empirically that DNNs perform better than logistic regression at propensity score matching. Code for both methods will be made available shortly on Github at: https://github.com/vikas84bf
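To make the matching idea in this abstract concrete, here is a minimal sketch of the simple nearest-neighbor matching baseline it refers to. It matches in the raw covariate space, not in an autoencoder embedding, and it is not the authors' code; the function names and the toy data are assumptions made for illustration.

```python
import math


def nearest_neighbor(x, pool):
    """Return the index of the point in `pool` closest to `x` (Euclidean distance)."""
    return min(range(len(pool)), key=lambda i: math.dist(x, pool[i]))


def matched_effects(covariates, treated, outcomes):
    """Estimate individual treatment effects by 1-nearest-neighbor matching.

    Each unit is matched to its closest unit in the opposite treatment group.
    The effect is always (treated outcome - control outcome).
    """
    treated_idx = [i for i, t in enumerate(treated) if t]
    control_idx = [i for i, t in enumerate(treated) if not t]
    effects = []
    for i, x in enumerate(covariates):
        opposite = control_idx if treated[i] else treated_idx
        j = opposite[nearest_neighbor(x, [covariates[k] for k in opposite])]
        if treated[i]:
            effects.append(outcomes[i] - outcomes[j])
        else:
            effects.append(outcomes[j] - outcomes[i])
    return effects


def average_treatment_effect(effects):
    """Average the individual effect estimates."""
    return sum(effects) / len(effects)


# Hypothetical toy data: one covariate, alternating treatment assignment.
covariates = [[0.0], [0.1], [1.0], [1.1]]
treated = [True, False, True, False]
outcomes = [3.0, 1.0, 5.0, 3.0]

effects = matched_effects(covariates, treated, outcomes)
print(effects, average_treatment_effect(effects))
```

Under the abstract's proposal, the distance would presumably be computed between learned low-dimensional embeddings instead of raw covariates; the matching step itself would stay the same.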


Learning by Playing - Solving Sparse Reward Tasks from Scratch

arXiv.org Machine Learning

We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm in the context of Reinforcement Learning (RL). SAC-X enables learning of complex behaviors - from scratch - in the presence of multiple sparse reward signals. To this end, the agent is equipped with a set of general auxiliary tasks that it attempts to learn simultaneously via off-policy RL. The key idea behind our method is that active (learned) scheduling and execution of auxiliary policies allows the agent to efficiently explore its environment - enabling it to excel at sparse reward RL. Our experiments in several challenging robotic manipulation settings demonstrate the power of our approach. A video of the rich set of learned behaviours can be found at https://youtu.be/mPKyvocNe_M.


Decision functions from supervised machine learning algorithms as collective variables for accelerating molecular simulations

arXiv.org Machine Learning

Selection of appropriate collective variables for enhancing molecular simulations remains an unsolved problem in computational biophysics. In particular, picking initial collective variables (CVs) is especially challenging in higher dimensions. Which atomic coordinates, or transforms thereof, from a list of thousands should one pick for enhanced sampling runs? How does a modeler even begin to pick starting coordinates for investigation? This remains true even in the case of simple two-state systems and only increases in difficulty for multi-state systems. In this work, we attempt to solve the initial CV problem using a data-driven approach inspired by the supervised machine learning literature. In particular, we show how the decision functions in supervised machine learning (SML) algorithms can be used as initial CVs for accelerated sampling. Using solvated alanine dipeptide and Chignolin mini-protein as our test cases, we illustrate how the distance to the Support Vector Machines decision hyperplane, the output probability estimates from Logistic Regression, and other classifiers may be used to reversibly sample slow structural transitions. We discuss the utility of other SML algorithms that might be useful for identifying CVs for accelerating molecular simulations.
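The two decision functions named in this abstract can be written down directly for a linear classifier. The sketch below is an illustration only: it assumes a linear model whose weights `w` and bias `b` have already been fitted to features from two known states, and the numbers are made up. It is not taken from the paper's code.

```python
import math


def decision_value(x, w, b):
    """Raw linear decision function w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def signed_distance(x, w, b):
    """Signed distance from x to the hyperplane w.x + b = 0 (candidate SVM-style CV)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(x, w, b) / norm


def logistic_probability(x, w, b):
    """Probability of the positive state under a logistic model (alternative CV)."""
    return 1.0 / (1.0 + math.exp(-decision_value(x, w, b)))


# Hypothetical weights and feature vectors (e.g. two dihedral-derived features).
w, b = [3.0, 4.0], -5.0
print(signed_distance([3.0, 4.0], w, b))       # point on the positive side
print(signed_distance([0.0, 0.0], w, b))       # point on the negative side
print(logistic_probability([1.0, 0.5], w, b))  # point on the decision boundary
```

Either quantity is a smooth scalar function of the input features, which is what makes it usable as a coordinate to bias along in an enhanced sampling run.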


Learning Discriminative Multilevel Structured Dictionaries for Supervised Image Classification

arXiv.org Machine Learning

Sparse representations have become popular in several applications of signal, image and video processing, such as denoising [1], [2], super-resolution, inpainting, compression [3]-[6] or classification. While it was common to analyze and reconstruct signals based on representations over predefined bases such as wavelets and DCT, research in recent years has shown that learning overcomplete dictionaries adapted to the structure of the treated signals can significantly improve the representation quality. Observing that learning redundant dictionaries from collections of data samples under sparsity priors leads to models that fit and approximate well the characteristics of signals [7], [8], the learning of dictionaries in a supervised setting for the discrimination of different classes of signals has also become a popular research problem [9]. In this work, we propose a method to learn multilevel structured dictionaries with high discrimination capability for the problem of pixelwise image classification. We consider a supervised classification setting where the classes are known and exemplars are available for each class. In particular, we are interested in image classification problems with a large amount of variability between data samples of the same class, resulting from e.g., dominant presence


Deep Private-Feature Extraction

arXiv.org Machine Learning

We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.


Does mitigating ML's impact disparity require treatment disparity?

arXiv.org Machine Learning

Following related work in law and policy, two notions of disparity have come to shape the study of fairness in algorithmic decision-making. Algorithms exhibit treatment disparity if they formally treat members of protected subgroups differently; algorithms exhibit impact disparity when outcomes differ across subgroups, even if the correlation arises unintentionally. Naturally, we can achieve impact parity through purposeful treatment disparity. In one thread of technical work, papers aim to reconcile the two forms of parity by proposing disparate learning processes (DLPs). Here, the learning algorithm can see group membership during training but produces a classifier that is group-blind at test time. In this paper, we show theoretically that: (i) When other features correlate with group membership, DLPs will (indirectly) implement treatment disparity, undermining the policy desiderata they are designed to address; (ii) When group membership is partly revealed by other features, DLPs induce within-class discrimination; and (iii) In general, DLPs provide a suboptimal trade-off between accuracy and impact parity. Based on our technical analysis, we argue that transparent treatment disparity is preferable to occluded methods for achieving impact parity. Experimental results on several real-world datasets highlight the practical consequences of applying DLPs vs. per-group thresholds.
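The contrast between a single group-blind threshold and per-group thresholds can be illustrated with a small sketch. Impact disparity is measured here as the gap in positive-decision rates between groups; that metric, the function names, and the scores are assumptions chosen for illustration and are not taken from the paper's experiments.

```python
def positive_rate(scores, threshold):
    """Fraction of individuals whose score meets or exceeds the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def impact_disparity(scores_by_group, thresholds):
    """Gap between the highest and lowest positive-decision rate across groups."""
    rates = [
        positive_rate(scores, thresholds[group])
        for group, scores in scores_by_group.items()
    ]
    return max(rates) - min(rates)


# Hypothetical classifier scores for two groups.
scores_by_group = {
    "a": [0.2, 0.4, 0.6, 0.8],
    "b": [0.1, 0.3, 0.5, 0.7],
}

# One threshold for everyone: no treatment disparity, but outcomes differ.
shared = {"a": 0.6, "b": 0.6}
print(impact_disparity(scores_by_group, shared))

# Per-group thresholds: explicit treatment disparity that equalizes outcomes.
per_group = {"a": 0.6, "b": 0.5}
print(impact_disparity(scores_by_group, per_group))
```

In this toy case the shared threshold leaves a gap in positive rates, while the per-group thresholds close it openly, which is the kind of transparent treatment disparity the abstract argues for.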