 Deep Learning


Memory Augmented Neural Networks with Wormhole Connections

arXiv.org Machine Learning

Recent empirical results on long-term dependency tasks have shown that neural networks augmented with an external memory can learn such tasks more easily and generalize better than vanilla recurrent neural networks (RNNs). We suggest that memory augmented neural networks (MANNs) can reduce the effects of vanishing gradients by creating shortcut (or wormhole) connections. Based on this observation, we propose a novel memory augmented neural network model called TARDIS (Temporal Automatic Relation Discovery in Sequences). The controller of TARDIS can store a selective set of embeddings of its own previous hidden states in an external memory and revisit them as and when needed. For TARDIS, memory acts as storage for wormhole connections to the past, which propagate gradients more effectively and help the model learn temporal dependencies. The memory structure of TARDIS has similarities to both Neural Turing Machines (NTM) and Dynamic Neural Turing Machines (D-NTM), but both the read and write operations of TARDIS are simpler and more efficient. We use discrete addressing for read/write operations, which helps to substantially reduce the vanishing gradient problem with very long sequences. Read and write operations in TARDIS are tied with a heuristic once the memory becomes full, which makes the learning problem simpler than in NTM- or D-NTM-type architectures. We provide a detailed analysis of gradient propagation in MANNs in general. We evaluate our models on different long-term dependency tasks and report competitive results on all of them.
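The memory mechanics described in this abstract can be illustrated with a small sketch. It is not the paper's implementation. The class and method names are hypothetical, the read address is chosen by a plain dot-product argmax rather than a learned controller, and the tying heuristic is assumed to be "overwrite the slot that was just read", which the abstract does not spell out.

```python
# Hypothetical sketch of a discretely addressed external memory.
# WormholeMemory, read and write are illustrative names, not the paper's API.

class WormholeMemory:
    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = []          # stored hidden-state embeddings
        self.last_read = None    # index of the most recently read slot

    def read(self, query):
        """Discrete read: return exactly one slot, picked by dot-product argmax."""
        if not self.slots:
            return None
        scores = [sum(q * m for q, m in zip(query, slot)) for slot in self.slots]
        self.last_read = max(range(len(scores)), key=scores.__getitem__)
        return self.slots[self.last_read]

    def write(self, embedding):
        """Append until full; afterwards overwrite the slot that was just read."""
        if len(self.slots) < self.num_slots:
            self.slots.append(list(embedding))
        else:
            # Assumed tying heuristic; fall back to slot 0 if nothing was read yet.
            target = self.last_read if self.last_read is not None else 0
            self.slots[target] = list(embedding)


memory = WormholeMemory(num_slots=2)
memory.write([1.0, 0.0])
memory.write([0.0, 1.0])
recalled = memory.read([0.9, 0.1])   # most similar to the first stored state
memory.write([0.5, 0.5])             # memory is full: replaces the slot just read
```

Because each read returns a single stored state rather than a weighted blend, a gradient flowing through the read reaches that earlier time step directly, which is the shortcut the abstract calls a wormhole connection.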


Double/Debiased/Neyman Machine Learning of Treatment Effects

arXiv.org Machine Learning

Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, and Newey (2016) provide a generic double/debiased machine learning (DML) approach for obtaining valid inferential statements about focal parameters, using Neyman-orthogonal scores and cross-fitting, in settings where nuisance parameters are estimated using a new generation of nonparametric fitting methods for high-dimensional data, called machine learning methods. In this note, we illustrate the application of this method in the context of estimating average treatment effects (ATE) and average treatment effects on the treated (ATTE) using observational data. A more general discussion and references to the existing literature are available in Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, and Newey (2016). Key words: Neyman machine learning, orthogonalization, cross-fitting, double or de-biased machine learning, orthogonal score, efficient score, post-machine-learning and post-regularization inference, random forest, lasso, deep learning, neural nets, boosted trees, efficiency, optimality.
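A minimal sketch of cross-fitting with an orthogonal score for the ATE, on synthetic data. Everything here is an assumption made for illustration: the data-generating process, the function names, and the use of per-cell sample means as the nuisance estimators (a real application would plug in a machine learning method such as a random forest or lasso). The score used is the doubly robust (AIPW) score.

```python
import random

def cell_means(rows, key, value):
    """Nuisance estimate for discrete keys: the mean of `value` within each `key`."""
    sums, counts = {}, {}
    for row in rows:
        k = key(row)
        sums[k] = sums.get(k, 0.0) + value(row)
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

def dml_ate(data, n_folds=2):
    """Cross-fitted ATE estimate using the doubly robust (AIPW) score."""
    scores = []
    for fold in range(n_folds):
        train = [r for i, r in enumerate(data) if i % n_folds != fold]
        test = [r for i, r in enumerate(data) if i % n_folds == fold]
        # Nuisance functions are fitted on the other folds only.
        g = cell_means(train, key=lambda r: (r["d"], r["x"]), value=lambda r: r["y"])
        m = cell_means(train, key=lambda r: r["x"], value=lambda r: float(r["d"]))
        for r in test:
            g1, g0, p = g[(1, r["x"])], g[(0, r["x"])], m[r["x"]]
            scores.append(
                g1 - g0
                + r["d"] * (r["y"] - g1) / p
                - (1 - r["d"]) * (r["y"] - g0) / (1 - p)
            )
    return sum(scores) / len(scores)

# Synthetic observational data: treatment uptake depends on the confounder x.
rng = random.Random(0)
TRUE_ATE = 2.0
data = []
for _ in range(4000):
    x = rng.randint(0, 1)
    d = int(rng.random() < (0.7 if x else 0.3))
    y = TRUE_ATE * d + 3.0 * x + rng.gauss(0.0, 1.0)
    data.append({"x": x, "d": d, "y": y})

ate_hat = dml_ate(data)
# Naive difference in means, which ignores the confounder.
naive = (sum(r["y"] for r in data if r["d"]) / sum(r["d"] for r in data)
         - sum(r["y"] for r in data if not r["d"]) / sum(1 - r["d"] for r in data))
```

In this setup the naive difference in means is biased upward because treated units are more likely to have x = 1, while the cross-fitted estimate should land close to `TRUE_ATE`.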


A primer on universal function approximation with deep learning (in Torch and R)

@machinelearnbot

Arthur C. Clarke famously stated that "any sufficiently advanced technology is indistinguishable from magic." No current technology embodies this statement more than neural networks and deep learning. And like any good magic it not only dazzles and inspires but also puts fear into people's hearts. One known property of artificial neural networks (ANNs) is that they are universal function approximators. This means that a sufficiently large neural network can approximate any continuous function on a bounded domain to arbitrary accuracy.
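The approximation property can be made concrete with a hand-built network. The sketch below is in plain Python rather than the Torch or R of the article title, and it sets the weights by construction instead of training them. The function names and the choice of sin(2πx) on [0, 1] as the target are assumptions for illustration. It builds a one-hidden-layer ReLU network that linearly interpolates the target, so adding hidden units shrinks the error.

```python
import math

def relu(z):
    return max(0.0, z)

def build_relu_net(f, n_units, lo=0.0, hi=1.0):
    """Return (bias, knots, weights) of a one-hidden-layer ReLU net that
    linearly interpolates f at n_units + 1 evenly spaced points on [lo, hi]."""
    h = (hi - lo) / n_units
    knots = [lo + i * h for i in range(n_units)]
    slopes = [(f(k + h) - f(k)) / h for k in knots]
    # Each hidden unit contributes the change in slope at its knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_units)]
    return f(lo), knots, weights

def net(x, params):
    """Evaluate the network: bias plus a weighted sum of shifted ReLUs."""
    bias, knots, weights = params
    return bias + sum(w * relu(x - k) for k, w in zip(knots, weights))

target = lambda x: math.sin(2 * math.pi * x)
grid = [i / 1000 for i in range(1001)]
errors = {}
for n_units in (5, 50):
    params = build_relu_net(target, n_units)
    # Worst-case absolute error over a fine grid on [0, 1].
    errors[n_units] = max(abs(net(x, params) - target(x)) for x in grid)
```

Comparing `errors[5]` with `errors[50]` shows the worst-case error falling as the hidden layer grows, which is the practical content of the approximation claim.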


Careers - Neurala

#artificialintelligence

At Neurala, we tackle the most challenging problems in artificial intelligence, deep learning and robotics.


The Real Potential of AI (hint: it's not robots)

#artificialintelligence

This week Stanford was the center of attention in the artificial intelligence community after it published news that it trained a deep learning model that diagnoses skin cancer as accurately as a dermatologist. The algorithm apparently can identify a cancerous mole with nothing more than a picture, meaning it could be put into the hands of anyone with a simple smartphone -- otherwise known as a pocket supercomputer. Deep learning is revolutionizing the way innovators can apply AI and data science to solve real-world problems. Image classification, facial recognition, computational linguistics, translation, augmented reality, self-driving cars -- all of these fields have made huge leaps in the last several years as computer scientists apply the rapidly-developing machine learning models that empower them. With all the excitement around these developments, one starts to wonder... what does a future with advanced AI look like?


This Week in Machine Learning, 20 January 2017 - Udacity Inc

#artificialintelligence

Machine Learning is one of the most exciting fields in the world. Every week we discover something new, something amazing, something revolutionary. It's incredible, but it can also be overwhelming. That's why we created This Week in Machine Learning! Each week we publish a curated list of Machine Learning stories as a resource to help you keep pace with all these exciting developments.


The sound of impending failure

#artificialintelligence

Sound is an incredibly valuable means of communicating information. Most motorists are familiar with the alarming noise of a slipping belt drive. And many other experts can detect problems with common machines in their respective fields just by listening to the sounds they make. If we can find a way to automate listening itself, we would be able to more intelligently monitor our world and its machines day and night. We could predict the failure of engines, rail infrastructure, oil drills and power plants in real time -- notifying humans the moment an acoustic anomaly occurs.


FPGA-Based AI System Recognizes Faces at 1,000 Images per Second - EE Times

#artificialintelligence

There is tremendous potential for facial recognition technology, such as informing visually impaired persons if someone they know is approaching them. I find it difficult to believe just how fast things are moving with regard to using artificial neural networks (ANNs) and deep learning techniques (for example, see Deep learning machine vision system aids blind and visually impaired, Deep learning hits a sweet note, Machine learning platform speeds optimization of vision systems, Unlocking the power of AI for all developers, and Push-button generation of deep neural networks). Of course, one really interesting application is to perform object detection and identification, including the really tricky task of recognizing and identifying faces in images and videos. This sort of task benefits from the extreme parallelism offered by FPGAs. Of particular interest are Intel's current generation of FPGAs, whose hard-core DSP slices offer both fixed-point and floating-point capabilities, making them suitable for a wide range of artificial intelligence (AI) and embedded vision applications.


Tech Leaders Are Just Now Getting Serious About AI Ethics

#artificialintelligence

A kind of ethics fever has taken hold of the AI community. As smart machines displace human jobs and seem poised to make life-or-death decisions in self-driving cars and health care, concerns about where AI is taking us are gaining increasing urgency. Earlier this month, the MIT Media Lab joined with the Harvard Berkman Klein Center for Internet & Society to anchor a $27 million Ethics and Governance of Artificial Intelligence initiative. The fund joins a growing array of AI ethics initiatives crisscrossing the corporate world and academia. In July 2016, leading AI researchers discussed the technologies' social and economic implications at the AI Now symposium in New York City.


Providing the Computational Power for Machine Learning - DZone Big Data

#artificialintelligence

Machine learning has largely been enabled by the coming together of large datasets, algorithms capable of making sense of the data, and affordable computing to underpin everything. It's interesting to see, therefore, that supercomputing giant Cray Inc. has recently undertaken a deep learning collaboration with Microsoft and the Swiss National Supercomputing Centre. The project aimed to improve the ability of companies to run deep learning algorithms at scale. The partnership worked to leverage their collective computing expertise to scale up the Microsoft Cognitive Toolkit onto a Cray XC50 supercomputer. The aim was to speed up the training process, and thus obtain in hours results that would typically take weeks, or even months.