 Deep Learning


Multi-label Classification using Labels as Hidden Nodes

arXiv.org Machine Learning

Competitive methods for multi-label classification typically invest in learning labels together. To do so in a beneficial way, analysis of label dependence is often seen as a fundamental step, separate and prior to constructing a classifier. Some methods invest up to hundreds of times more computational effort in building dependency models than in training the final classifier itself. We extend some recent discussion in the literature and provide a deeper analysis, namely, developing the view that label dependence is often introduced by an inadequate base classifier, rather than being inherent to the data or underlying concept; showing how even an exhaustive analysis of label dependence may not lead to an optimal classification structure. Viewing labels as additional features (a transformation of the input), we create neural-network inspired novel methods that remove the emphasis of a prior dependency structure. Our methods have an important advantage particular to multi-label data: they leverage labels to create effective units in middle layers, rather than learning these units from scratch in an unsupervised fashion with gradient-based methods. Results are promising. The methods we propose perform competitively, and also have very important qualities of scalability.


Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

arXiv.org Artificial Intelligence

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.
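The idea of training parameters so that a few gradient steps adapt them well can be illustrated with a deliberately tiny sketch. Everything below is an assumption made for illustration, not the authors' code: the model is a single scalar `w` in `y = w * x`, each task is a regression problem with a different slope, the same points serve as both support and query data, and the second derivative needed for the meta-gradient is written out by hand instead of using automatic differentiation. The helper names (`mse_grad`, `adapt`, `meta_gradient`, `meta_train`, `make_task`) are hypothetical.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error for the one-parameter model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, alpha):
    """Inner loop: one gradient step on a task's support set."""
    return w - alpha * mse_grad(w, xs, ys)


def meta_gradient(w, task, alpha):
    """Gradient of the post-adaptation query loss with respect to the initial w."""
    (support_x, support_y), (query_x, query_y) = task
    w_adapted = adapt(w, support_x, support_y, alpha)
    # Chain rule: d(w_adapted)/dw = 1 - alpha * (second derivative of the support loss).
    # For this model the second derivative is 2 * mean(x^2).
    curvature = 2.0 * sum(x * x for x in support_x) / len(support_x)
    return mse_grad(w_adapted, query_x, query_y) * (1.0 - alpha * curvature)


def meta_train(tasks, w=0.0, alpha=0.1, beta=0.1, steps=100):
    """Outer loop: gradient descent on the summed post-adaptation losses."""
    for _ in range(steps):
        w -= beta * sum(meta_gradient(w, task, alpha) for task in tasks)
    return w


def make_task(slope, xs=(1.0, 2.0)):
    """Toy task: noiseless samples of y = slope * x, reused as support and query."""
    ys = [slope * x for x in xs]
    return (list(xs), ys), (list(xs), ys)


if __name__ == "__main__":
    tasks = [make_task(1.0), make_task(3.0)]
    w0 = meta_train(tasks)
    print("meta-learned initialization:", w0)
    print("after one step on the slope-1 task:", adapt(w0, *tasks[0][0], alpha=0.1))
```

In this toy setting the meta-learned initialization settles between the two task slopes, so a single inner step moves it toward either task. A realistic implementation would use a neural network, separate support and query batches, and an autodiff library for the second-order term.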


Understanding Deep Neural Networks with Rectified Linear Units

arXiv.org Artificial Intelligence

In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU). We give the first-ever polynomial time (in the size of data) algorithm to train to global optimality a ReLU DNN with one hidden layer, assuming the input dimension and number of nodes of the network as fixed constants. We also improve on the known lower bounds on size (from exponential to super exponential) for approximating a ReLU deep net function by a shallower ReLU net. Our gap theorems hold for smoothly parametrized families of "hard" functions, contrary to countable, discrete families known in the literature. An example consequence of our gap theorems is the following: for every natural number $k$ there exists a function representable by a ReLU DNN with $k^2$ hidden layers and total size $k^3$, such that any ReLU DNN with at most $k$ hidden layers will require at least $\frac{1}{2}k^{k+1}-1$ total nodes. Finally, we construct a family of $\mathbb{R}^n\to \mathbb{R}$ piecewise linear functions for $n\geq 2$ (also smoothly parameterized), whose number of affine pieces scales exponentially with the dimension $n$ at any fixed size and depth. To the best of our knowledge, such a construction with exponential dependence on $n$ has not been achieved by previous families of "hard" functions in the neural nets literature. This construction utilizes the theory of zonotopes from polyhedral theory.
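As a small illustration of what "functions representable by ReLU networks" means, the sketch below builds two piecewise linear functions from a one-hidden-layer ReLU network on a scalar input. The weights are hand-picked for this example and are not taken from the paper; the function names (`one_hidden_layer`, `absolute_value`, `hat`) are hypothetical.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def one_hidden_layer(x, weights, biases, out_weights):
    """Scalar-input network: sum_i c_i * relu(w_i * x + b_i)."""
    return sum(c * relu(w * x + b) for w, b, c in zip(weights, biases, out_weights))


def absolute_value(x):
    """|x| = relu(x) + relu(-x): two hidden units, two affine pieces."""
    return one_hidden_layer(x, [1.0, -1.0], [0.0, 0.0], [1.0, 1.0])


def hat(x):
    """Hat function: relu(x) - 2*relu(x - 1) + relu(x - 2).

    Zero outside [0, 2], rising to a peak of 1 at x = 1.
    """
    return one_hidden_layer(x, [1.0, 1.0, 1.0], [0.0, -1.0, -2.0], [1.0, -2.0, 1.0])


if __name__ == "__main__":
    for x in (-1.0, 0.5, 1.0, 1.5, 3.0):
        print(x, absolute_value(x), hat(x))
```

Each hidden unit contributes at most one breakpoint, which is why counting affine pieces is a natural way to measure what a network of a given size and depth can express.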


5 Free Resources for Getting Started with Self-driving Vehicles

@machinelearnbot

Recent years have witnessed amazing progress in AI related fields such as computer vision, machine learning and autonomous vehicles. As with any rapidly growing field, however, it becomes increasingly difficult to stay up-to-date or enter the field as a beginner. While several topic specific survey papers have been written, to date no general survey on problems, datasets and methods in computer vision for autonomous vehicles exists. This paper attempts to narrow this gap by providing a state-of-the-art survey on this topic. Our survey includes both the historically most relevant literature as well as the current state-of-the-art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding and end-to-end learning. A lengthy, thorough overview, and probably the best starting place for anyone looking to get up to speed in the field quickly, and in one spot.


This Deep Learning AI Generated Thousands of Creepy Cat Pictures

#artificialintelligence

Cats are some of the most photogenic animals on the planet, except for when those photos are generated by artificial intelligences. Then, the cute and cuddly creatures become creepy, nightmare cats. Case in point is the Meow Generator, a collection of machine learning algorithms that have been unleashing thousands of disturbing cat faces on the world--15,749 of them, to be exact. These creepy cats are the design of Alexia Jolicoeur-Martineau, a data scientist with a passion for machine learning and kitties. "I did this to get more practical experience with deep learning, as I want to apply for a PHD next year," Jolicoeur-Martineau told me.


Police bodycams could spot criminals with real-time artificial intelligence

#artificialintelligence

Police officers could soon be wearing body-mounted cameras programmed to spot criminals and missing people in real-time, using artificial intelligence. The cameras, built by Motorola and similar to those already used by some US police forces to record an officer's point of view, could also help find missing objects like a stolen car, thanks to machine learning. A prototype of the AI camera is already being developed by Motorola and Neurala, a deep learning startup based in Boston, Massachusetts that recently added its software to drone cameras to help track poachers in Africa. The smart camera will learn while it is used and "automatically search for persons or objects of interest, significantly reducing the time and effort required to find a missing child or suspicious object in environments that are often crowded or chaotic," Motorola and Neurala said in a joint statement. "We see powerful potential for artificial intelligence to improve safety and efficiency for our customers, which in turn helps create safer communities," said Paul Steinberg, chief technology officer of Motorola Solutions.


Winning Strategies for Applied AI Companies – Machine Learnings

#artificialintelligence

To give you a little more context -- and paraphrasing Alex's post -- we have entered the third wave of AI startups: the wave of applied AI companies. The first wave was purely research-driven companies, with companies like DeepMind and Nnaisense standing out. Most of them never really commercialized their product and were acquihired before generating revenue. A second wave followed and consisted of companies building machine learning infrastructures.


Police body cams will soon use AI to find missing people

Engadget

Motorola is adding machine learning to its surveillance equipment used by law enforcement personnel. Cops in Chicago's Waukegan police department are already suiting up with the company's Si500 body cams. But those same cameras could soon pack AI that could help officers identify missing people and objects. A prototype device is in the works with Neurala, a deep learning startup that recently integrated its software with drones to track poachers in Africa. In the near future, the camera will be able to recognize images and communicate that data with other Si500s.


Implementing MaLSTM on Kaggle's Quora Question Pairs competition

#artificialintelligence

In the past few years, deep learning has been all the fuss in the tech industry. To keep up on things, I like to get my hands dirty implementing interesting network architectures I come across in my reading. A few months ago I came across a very nice article called Siamese Recurrent Architectures for Learning Sentence Similarity, which offers a pretty straightforward approach to the common problem of sentence similarity. Named MaLSTM ("Ma" for Manhattan distance), its architecture is depicted in figure 1 (the diagram excludes the sentence preprocessing part). Notice that since this is a Siamese network, it is easier to train because it shares weights on both sides.
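A minimal sketch of the two ideas named above, Manhattan distance and weight sharing, might look like the following. It assumes the similarity is the exponential of the negative L1 distance between the two sentence encodings, which is how I recall the MaLSTM paper defining it; treat that form as an assumption here. The `toy_encode` function is a hypothetical stand-in for the LSTM encoder, used only so the example runs without a deep learning library.

```python
import math


def manhattan_similarity(h_left, h_right):
    """Similarity in (0, 1]: exp(-L1 distance) between two encodings."""
    distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-distance)


def toy_encode(sentence):
    """Hypothetical stand-in for an LSTM encoder: two crude length features."""
    words = sentence.split()
    mean_word_length = sum(len(w) for w in words) / max(len(words), 1)
    return [len(words) / 10.0, mean_word_length / 10.0]


def siamese_score(encode, left, right):
    """Siamese scoring: the SAME encoder (shared weights) is applied to both sides."""
    return manhattan_similarity(encode(left), encode(right))


if __name__ == "__main__":
    q1 = "How do I learn Python quickly"
    q2 = "What is the fastest way to learn Python"
    print(siamese_score(toy_encode, q1, q1))
    print(siamese_score(toy_encode, q1, q2))
```

Because one encoder serves both inputs, the score is symmetric in its arguments and there is only one set of parameters to train.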


Biologically Inspired Software Architecture for Deep Learning

#artificialintelligence

In the Google paper, the authors enumerate many risk factors, design patterns, and anti-patterns that need to be taken into consideration in an architecture. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies and changes in the external world. By contrast, in Deep Learning systems (and equally in machine learning), code is created from training data. A recent paper from the folks at Berkeley explores the requirements for building these new kinds of systems (see: "Real-Time Machine Learning: The Missing Pieces").