Deep Learning
Loss Max-Pooling for Semantic Image Segmentation
Bulò, Samuel Rota, Neuhold, Gerhard, Kontschieder, Peter
We introduce a novel loss max-pooling concept for handling imbalanced training data distributions, applicable as an alternative loss layer in the context of deep neural networks for semantic image segmentation. Most real-world semantic segmentation datasets exhibit long-tail distributions, with a few object categories comprising the majority of the data and consequently biasing the classifiers towards them. Our method adaptively re-weights the contribution of each pixel based on its observed loss, targeting under-performing classification results as often encountered for under-represented object classes. Our approach goes beyond conventional cost-sensitive learning attempts through adaptive considerations that allow us to indirectly address both inter- and intra-class imbalances. We provide a theoretical justification of our approach, complementary to experimental analyses on benchmark datasets. In our experiments on the Cityscapes and Pascal VOC 2012 segmentation datasets we find consistently improved results, demonstrating the efficacy of our approach.
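The re-weighting idea in this abstract can be illustrated with a deliberately simplified sketch. It is not the paper's formulation: as an assumption made purely for illustration, it places uniform weight on the k highest-loss pixels (a top-k stand-in for adaptive weighting). The function name `top_k_weighted_loss` and the toy loss values are hypothetical.

```python
def top_k_weighted_loss(pixel_losses, k):
    """Average the k largest per-pixel losses (illustrative stand-in only)."""
    if not 1 <= k <= len(pixel_losses):
        raise ValueError("k must be between 1 and the number of pixels")
    # Rank pixel indices from highest to lowest observed loss.
    ranked = sorted(range(len(pixel_losses)),
                    key=lambda i: pixel_losses[i], reverse=True)
    selected = set(ranked[:k])
    # Uniform weight 1/k on the selected pixels, zero on all others.
    weights = [1.0 / k if i in selected else 0.0
               for i in range(len(pixel_losses))]
    loss = sum(w * l for w, l in zip(weights, pixel_losses))
    return loss, weights


# Toy example: two poorly classified pixels among four easy ones.
losses = [0.1, 0.2, 0.1, 2.0, 1.6, 0.1]
pooled, weights = top_k_weighted_loss(losses, k=2)
uniform = sum(losses) / len(losses)
```

With `k` equal to the number of pixels, this sketch reduces to the ordinary uniform average, so `k` controls how strongly the loss concentrates on hard pixels.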
Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed Prediction
Ma, Xiaolei, Dai, Zhuang, He, Zhengbing, Na, Jihui, Wang, Yong, Wang, Yunpeng
Academic Editor: Simon X. Yang. Received: 30 January 2017; Accepted: 7 April 2017.
Abstract: This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with high accuracy. Spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix. A CNN is applied to the image in two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction. The effectiveness of the proposed method is evaluated on two real-world transportation networks, the second ring road and the northeast transportation network in Beijing, by comparing the method with four prevailing algorithms, namely, ordinary least squares, k-nearest neighbors, artificial neural network, and random forest, and three deep learning architectures, namely, stacked autoencoder, recurrent neural network, and long short-term memory network. The results show that the proposed method outperforms the other algorithms by an average accuracy improvement of 42.91% within an acceptable execution time. The CNN can train the model in a reasonable time and, thus, is suitable for large-scale transportation networks.
Keywords: transportation network; traffic speed prediction; spatiotemporal feature; deep learning; convolutional neural network
1. Introduction
Predicting the future is one of the most attractive topics for human beings, and the same is true for transportation management. Understanding traffic evolution for the entire road network, rather than on a single road, is of great interest and importance: it helps people with complete traffic information make better route choices and supports traffic managers in managing a road network and allocating resources systematically [1,2].
However, large-scale network traffic prediction places more demanding requirements on prediction models, such as the ability to deal with the higher computational complexity incurred by the network topology, the ability to form a more intelligent and efficient prediction that captures the spatial correlation of traffic on roads spread across a two-dimensional plane, and the ability to forecast longer-term futures to reflect congestion propagation. Thus, existing models may fail to predict large-scale network traffic evolution. In the existing literature, two families of research methods have dominated studies in traffic forecasting: statistical methods and neural networks [3]. Statistical techniques are widely used in traffic prediction.
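The two-dimensional time-space matrix mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: rows are taken to be road sections and columns time intervals, missing observations default to 0.0, and the names `build_time_space_matrix` and `to_grayscale` as well as the toy records are hypothetical.

```python
def build_time_space_matrix(records, num_sections, num_intervals):
    """Arrange (section, interval, speed) records into a 2-D grid.

    Rows index road sections (space); columns index time intervals.
    Cells without an observation stay at 0.0 (an assumption of this sketch).
    """
    matrix = [[0.0] * num_intervals for _ in range(num_sections)]
    for section, interval, speed in records:
        matrix[section][interval] = speed
    return matrix


def to_grayscale(matrix):
    """Scale speeds by the maximum speed so the grid reads like an image."""
    max_speed = max(max(row) for row in matrix)
    if max_speed == 0:
        return [row[:] for row in matrix]
    return [[value / max_speed for value in row] for row in matrix]


# Toy records: (road section index, time interval index, speed in km/h).
records = [(0, 0, 60.0), (0, 1, 30.0), (1, 0, 45.0), (1, 1, 15.0)]
speeds = build_time_space_matrix(records, num_sections=2, num_intervals=2)
image = to_grayscale(speeds)
```

A grid of this kind is what a CNN would then consume as a single-channel image.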
CERN: Confidence-Energy Recurrent Network for Group Activity Recognition
Shu, Tianmin, Todorovic, Sinisa, Zhu, Song-Chun
This work is about recognizing human activities occurring in videos at distinct semantic levels, including individual actions, interactions, and group activities. The recognition is realized using a two-level hierarchy of Long Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture, which can be trained end-to-end. In comparison with existing architectures of LSTMs, we make two key contributions giving the name to our approach as Confidence-Energy Recurrent Network -- CERN. First, instead of using the common softmax layer for prediction, we specify a novel energy layer (EL) for estimating the energy of our predictions. Second, rather than finding the common minimum-energy class assignment, which may be numerically unstable under uncertainty, we specify that the EL additionally computes the p-values of the solutions, and in this way estimates the most confident energy minimum. The evaluation on the Collective Activity and Volleyball datasets demonstrates: (i) advantages of our two contributions relative to the common softmax and energy-minimization formulations and (ii) a superior performance relative to the state-of-the-art approaches.
Field of Groves: An Energy-Efficient Random Forest
Takhirov, Zafar, Wang, Joseph, Louis, Marcia S., Saligrama, Venkatesh, Joshi, Ajay
Machine Learning (ML) algorithms, such as Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs), have become widespread and can achieve high statistical performance. However, their accuracy decreases significantly in the energy-constrained mobile and embedded systems space, where all computations need to be completed under a tight energy budget. In this work, we present a field of groves (FoG) implementation of random forests (RF) that achieves an accuracy comparable to CNNs and SVMs under tight energy budgets. Evaluation of the FoG shows that at comparable accuracy it consumes ~1.48x, ~24x, ~2.5x, and ~34.7x lower energy per classification compared to conventional RF, SVM_RBF, MLP, and CNN, respectively. FoG is ~6.5x less energy efficient than SVM_LR, but achieves 18% higher accuracy on average across all considered datasets.
NIPS 2016 Workshop on Representation Learning in Artificial and Biological Neural Networks (MLINI 2016)
Wehbe, Leila, Nunez-Elizalde, Anwar, van Gerven, Marcel, Rish, Irina, Murphy, Brian, Grosse-Wentrup, Moritz, Langs, Georg, Cecchi, Guillermo
This workshop explores the interface between cognitive neuroscience and recent advances in AI fields that aim to reproduce human performance, such as natural language processing and computer vision, and specifically deep learning approaches to such problems. When studying the cognitive capabilities of the brain, scientists follow a system identification approach in which they present different stimuli to the subjects and try to model the response of different brain areas to each stimulus. The goal is to understand the brain by trying to find the function that expresses the activity of brain areas in terms of different properties of the stimulus. Experimental stimuli are becoming increasingly complex, with more and more people interested in studying real-life phenomena such as the perception of natural images or natural sentences. There is therefore a need for a rich and adequate vector representation of the properties of the stimulus, which we can obtain using advances in machine learning. In parallel, new ML approaches, many of them in deep learning, are inspired to a certain extent by human behavior or biological principles. Neural networks, for example, were originally inspired by biological neurons. More recently, processes such as attention, which are inspired by human behavior, are being used. However, the large bulk of these methods are independent of findings about brain function, and it is unclear whether it is at all beneficial for machine learning to try to emulate brain function in order to achieve the same tasks that the brain achieves.
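The system identification approach described above amounts to fitting a function from stimulus properties to measured brain activity. The sketch below shows the simplest possible case, assuming a single scalar stimulus feature and an ordinary least-squares line; real encoding models use high-dimensional features and regularization. The function name `fit_linear_encoding` and the toy numbers are hypothetical.

```python
def fit_linear_encoding(feature, response):
    """Fit response ~ slope * feature + intercept by ordinary least squares."""
    n = len(feature)
    if n != len(response) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 samples")
    mean_x = sum(feature) / n
    mean_y = sum(response) / n
    # Spread of the stimulus feature around its mean.
    var_x = sum((x - mean_x) ** 2 for x in feature)
    if var_x == 0:
        raise ValueError("stimulus feature must vary across samples")
    # Co-variation between the stimulus feature and the measured response.
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(feature, response))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data: one stimulus property and the activity of one brain area.
stimulus_feature = [0.0, 1.0, 2.0, 3.0]
area_activity = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_linear_encoding(stimulus_feature, area_activity)
```

The fitted slope then summarizes how strongly that brain area's activity tracks the chosen stimulus property in this toy setting.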
DeepCare: A Deep Dynamic Memory Model for Predictive Medicine
Pham, Trang, Tran, Truyen, Phung, Dinh, Venkatesh, Svetha
Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, recorded in electronic medical records, are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illness states and predicts future medical outcomes. At the data level, DeepCare represents care episodes as vectors in space and models patient health state trajectories through explicit memory of historical records. Built on Long Short-Term Memory (LSTM), DeepCare introduces time parameterizations to handle irregularly timed events by moderating the forgetting and consolidation of memory cells. DeepCare also incorporates medical interventions that change the course of illness and shape future medical risk. Moving up to the health state level, historical and present health states are then aggregated through multiscale temporal pooling, before passing through a neural network that estimates future outcomes. We demonstrate the efficacy of DeepCare for disease progression modeling, intervention recommendation, and future risk prediction. On two important cohorts with heavy social and economic burden -- diabetes and mental health -- the results show improved modeling and risk prediction accuracy.
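The idea of moderating forgetting according to irregular time gaps can be sketched for a single scalar memory cell. This is an illustration under assumptions, not DeepCare's actual parameterization: the decay function 1 / log(e + Δt) is chosen here only as one plausible monotone decay, gate values are passed in directly rather than learned, and the names `time_decay` and `update_memory` are hypothetical.

```python
import math


def time_decay(delta_t):
    """Decreasing factor in (0, 1]; equals 1 when no time has elapsed."""
    if delta_t < 0:
        raise ValueError("elapsed time must be non-negative")
    return 1.0 / math.log(math.e + delta_t)


def update_memory(prev_cell, candidate, forget_gate, input_gate, delta_t):
    """LSTM-style cell update whose forget gate shrinks with elapsed time."""
    # Longer gaps between episodes retain less of the previous memory.
    moderated_forget = time_decay(delta_t) * forget_gate
    return moderated_forget * prev_cell + input_gate * candidate


# Same gates, different gaps (in days) since the previous care episode.
recent = update_memory(1.0, 0.5, forget_gate=0.9, input_gate=0.2, delta_t=1)
distant = update_memory(1.0, 0.5, forget_gate=0.9, input_gate=0.2, delta_t=365)
```

In this sketch the old memory contributes less after the longer gap, so `distant` ends up smaller than `recent` while the new-input term is unchanged.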
Intent On Winning Arms Race, Alphabet Inc Continues To Gobble Up A.I. Specialists
With a total of eleven acquisitions in the past four years alone, Google parent Alphabet Inc (NASDAQ:GOOGL) is clearly the most aggressive company in terms of buying up artificial intelligence (AI) firms. That's according to market researchers at CB Insights, who recently published a study recapping all the major acquisitions in the space since 2012. In 2013, Google picked up deep learning and neural network startup DNNresearch from the computer science department at the University of Toronto. This acquisition reportedly helped Google make major upgrades to its image search feature. In 2014, Google acquired British company DeepMind Technologies for some $600M (Google's DeepMind program recently beat a human world champion in the board game "Go"). Last year, it acquired visual search startup Moodstocks and bot platform Api.ai.
Your life in AI's hands: The battle to understand deep learning - TechRepublic
As society enters an era where AI will take life-or-death decisions--spotting whether moles are cancerous and driving us to work--trusting these machines will become ever more important. The difficulty is that it's almost impossible for us to understand the inner workings of many modern AI systems that perform human-like tasks, such as recognizing real-life objects or understanding speech. The models produced by the deep-learning systems that have powered recent AI breakthroughs are largely opaque, functioning as black boxes that spit out a result but whose operation remains mysterious. This inscrutability stems from the complexity of the large neural networks that underpin deep-learning systems. These brain-inspired networks are interconnected layers of algorithms that feed data into each other and can be trained to carry out specific tasks.
Player Piano, or How Digital Marketers Can Survive the Advent of AI - State of Digital
This post is the written version of the talk I presented at The Inbounder World Tour Madrid on March 17. Anyone who knows me, or simply follows me on social networks, knows that I am a science fiction geek, which is not surprising in a man who in his childhood saw "Star Wars" in the movie theater or the original series of "Battlestar Galactica" on television. Robots are among the main characters in sci-fi TV series, movies and novels. These mechanical beings with human or superhuman intelligence have always fascinated us. If we look only at the history of cinema, we can find some naive robots, like the Tin Man of "The Wizard of Oz", or subtly dangerous ones, like Ava in "Ex-Machina", or ones more human than human, like the mythical Roy Batty of "Blade Runner".
SAS chief data scientist says that we've only built 'weak AI' – so far
TORONTO – When we see artificial intelligence (AI) in fiction, it usually involves the AI functioning just like a human. That's called 'strong AI' and, well, we aren't there yet. There are two types of AI: strong AI, the aforementioned AI that functions as a human would; and weak AI, the type of AI we see today. For example, robotics in a manufacturing plant that function autonomously to complete one task are weak AI. "That's where we are [with weak AI], but I think we are trending towards strong AI," said Wayne Thompson, the chief data scientist at SAS, at the analytics vendor's event in Toronto. "We are trending towards what I consider modern machine learning."