Deep Learning
Deep LSTM-Based Goal Recognition Models for Open-World Digital Games
Min, Wookhee (North Carolina State University) | Mott, Bradford (North Carolina State University) | Rowe, Jonathan (North Carolina State University) | Lester, James (North Carolina State University)
Player goal recognition in digital games offers the promise of enabling games to dynamically customize player experience. Goal recognition aims to recognize players' high-level intentions using a computational model trained on a player behavior corpus. A significant challenge is posed by devising reliable goal recognition models with a behavior corpus characterized by highly idiosyncratic player actions. In this paper, we introduce deep LSTM-based goal recognition models that handle the inherent uncertainty stemming from noisy, non-optimal player behaviors. Empirical evaluation indicates that deep LSTMs outperform competitive baselines including single-layer LSTMs, n-gram encoded feedforward neural networks, and Markov logic networks for a goal recognition corpus collected from an open-world educational game. In addition to metric-based goal recognition model evaluation, we investigate a visualization technique to show a dynamic goal recognition model's performance over the course of a player's goal-seeking behavior. Deep LSTMs, which are capable of both sequentially and hierarchically extracting salient features of player behaviors, show significant promise as a goal recognition approach for open-world digital games.
Hybrid Activity and Plan Recognition for Video Streams
Granada, Roger Leitzke (Pontifical Catholic University of Rio Grande do Sul) | Pereira, Ramon Fraga (Pontifical Catholic University of Rio Grande do Sul) | Monteiro, Juarez (Pontifical Catholic University of Rio Grande do Sul) | Barros, Rodrigo Coelho (Pontifical Catholic University of Rio Grande do Sul) | Ruiz, Duncan (Pontifical Catholic University of Rio Grande do Sul) | Meneguzzi, Felipe (Pontifical Catholic University of Rio Grande do Sul)
Computer-based human activity recognition of daily living has recently attracted much interest due to its applicability to ambient assisted living. Such applications require the automatic recognition of high-level activities composed of multiple actions performed by human beings in an environment. In this work, we address the problem of activity recognition in an indoor environment, focusing on a kitchen scenario. Unlike existing approaches that identify single actions from video sequences, we also identify the goal that the subject of the video is pursuing. Our hybrid approach combines a deep learning architecture that analyzes raw video data and identifies individual actions with a goal recognition algorithm that processes those actions, using a plan library describing possible overarching activities, to identify the ultimate goal of the subject in the video. Experiments show that our approach achieves state-of-the-art performance in identifying cooking activities in a kitchen scenario.
A Deep Multi-Task Learning Approach to Skin Lesion Classification
Haofu, Liao (University of Rochester) | Luo, Jiebo (University of Rochester)
Visual aspects of skin diseases, especially skin lesions, play a key role in dermatological diagnosis. A successful identification of the skin lesion allows skin disorders to be placed in certain diagnostic categories where specific diagnosis can be established (Cecil, Goldman, and Schafer 2012). However, categorization of skin lesions is a challenging process. It usually involves identifying the specific morphology, distribution, color, shape and arrangement of lesions. When these components are analyzed separately, the differentiation of skin lesions can be quite complex and requires a great deal of experience and expertise (Lawrence and Cox 2002). However, instead of treating the skin lesion classification as a standalone problem and training a CNN model using skin lesion labels only, we further propose to jointly optimize the skin lesion classification with a related auxiliary task, body location classification. The motivation behind this design is to make use of the body site predilection of skin diseases (Cox and Coulson 2004) as it has long been recognized by dermatologists that many skin diseases and their corresponding skin lesions are correlated with their body site manifestation. For example, a skin lesion caused by sun exposure is only present in sun-exposed areas of the
Scalable Classifiers with ADMM and Transpose Reduction
Taylor, Gavin (United States Naval Academy) | Xu, Zheng (University of Maryland) | Goldstein, Tom (University of Maryland)
As datasets for machine learning grow larger, parallelization strategies become more and more important. Recent approaches to distributed model fitting rely heavily either on consensus ADMM, where each node solves small sub-problems using only local data, or on stochastic gradient methods that don't scale well to large numbers of cores in a cluster setting. For this reason, GPU clusters have become common prerequisites to large-scale machine learning. This paper describes an unconventional training method that uses alternating direction methods and Bregman iteration to train a variety of machine learning models on CPUs while avoiding the drawbacks of consensus methods and without gradient descent steps. Using transpose reduction strategies, the proposed method reduces the optimization problems to a sequence of minimization sub-steps that can each be solved globally in closed form. The method provides strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
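The transpose reduction idea can be illustrated on the simplest closed-form sub-problem, ordinary least squares. This is a minimal sketch under stated assumptions, not the authors' implementation: the choice of least squares, the function names (local_summary, reduce_summaries, solve, distributed_least_squares), and the toy data layout are all hypothetical. Each node reduces its row block (D_i, b_i) to the small summaries D_i^T D_i and D_i^T b_i, and only those summaries are summed and solved centrally.

```python
# Illustrative sketch only: least squares is an assumed example problem.

def local_summary(rows, targets):
    """Compute (D_i^T D_i, D_i^T b_i) from one node's rows and targets."""
    n = len(rows[0])
    gram = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for row, t in zip(rows, targets):
        for j in range(n):
            rhs[j] += row[j] * t
            for k in range(n):
                gram[j][k] += row[j] * row[k]
    return gram, rhs


def reduce_summaries(summaries):
    """Sum per-node summaries; the result is n x n however many rows exist."""
    n = len(summaries[0][1])
    gram = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for g, r in summaries:
        for j in range(n):
            rhs[j] += r[j]
            for k in range(n):
                gram[j][k] += g[j][k]
    return gram, rhs


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def distributed_least_squares(node_blocks):
    """Solve min_x ||D x - b||^2 using only per-node n x n summaries."""
    summaries = [local_summary(rows, targets) for rows, targets in node_blocks]
    gram, rhs = reduce_summaries(summaries)
    return solve(gram, rhs)


# Two hypothetical nodes whose targets follow b = 2*x1 - 3*x2 exactly.
node_a = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, -3.0, -1.0])
node_b = ([[2.0, 1.0], [1.0, 3.0]], [1.0, -7.0])
solution = distributed_least_squares([node_a, node_b])
```

In this sketch the communication cost depends on the number of features rather than the number of data rows, which is the property that makes a central closed-form solve feasible.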
Deep Style Match for Complementary Recommendation
Zhao, Kui (Zhejiang University) | Hu, Xia (Hangzhou Science & Technology Information Research Institute) | Bu, Jiajun | Wang, Can (Zhejiang University)
Humans develop a common sense of style compatibility between items based on their attributes. We seek to automatically answer questions like "Does this shirt go well with that pair of jeans?" In order to answer these kinds of questions, we attempt to model the human sense of style compatibility in this paper. The basic assumption of our approach is that most of the important attributes for a product in an online store are included in its title description. Therefore it is feasible to learn style compatibility from these descriptions. We design a Siamese Convolutional Neural Network architecture and feed it with title pairs of items, which are either compatible or incompatible. Those pairs are mapped from the original space of symbolic words into an embedded style space. Our approach takes only words as input with little preprocessing and requires no laborious and expensive feature engineering.
Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network
Yoon, Seunghyun (Seoul National University) | Yun, Hyeongu (Seoul National University) | Kim, Yuna (Samsung Electronics) | Park, Gyu-tae (Samsung Electronics) | Jung, Kyomin (Seoul National University)
In this paper, we propose efficient transfer learning methods for training a personalized language model using a recurrent neural network with a long short-term memory architecture. With our proposed fast transfer learning schemes, a general language model is updated to a personalized language model with a small amount of user data and limited computing resources. These methods are especially useful in a mobile device environment, where the data is prevented from being transferred off the device for privacy purposes. Through experiments on dialogue data from a drama, we verify that our transfer learning methods successfully generate a personalized language model whose output is more similar to the personal language style in both qualitative and quantitative aspects.
Regularization and Learning an Ensemble of RNNs by Decorrelating Representations
Yadav, Mohit (TCS Research New-Delhi) | Agarwal, Sakshi (IIT Kharagpur)
Recurrent Neural Networks (RNNs) and their variants (such as LSTMs and GRUs) have been remarkably successful at machine-learning tasks on diverse kinds of sequential data (e.g. text, time-series, etc.). However, training of RNNs continues to be a challenge due to difficulties stemming from regularization and the highly non-convex optimizations involved. In this paper, we propose to regularize training of RNNs by encouraging higher decorrelation in the hidden representations. The cost function is devised to minimize non-diagonal elements of the correlation matrix computed over the hidden representations of RNNs, along with the usual training accuracy term, thereby penalizing redundancy in the learned model. Furthermore, we propose to utilize the idea of decorrelating representations in learning an ensemble of RNNs, in order to maximize diversity in the resulting models, thus forcing every individual network of the ensemble to gain abilities that are complementary to the ensemble. Extensive experiments are presented on various datasets with different architectures of RNNs. Results are offered for multiple tasks and show that the proposed methods yield a significant improvement when compared with the state-of-the-art methods.
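The regularizer described above can be sketched as a penalty on the off-diagonal entries of the correlation matrix of hidden activations. This is a hedged illustration rather than the paper's exact cost: the squared form of the penalty, the weight value, and the names correlation_matrix, decorrelation_penalty, and regularized_loss are assumptions made for the example.

```python
import math


def correlation_matrix(hidden):
    """Pearson correlation between hidden units.

    hidden is a list of activation vectors, one per example (or time step).
    """
    n = len(hidden)
    d = len(hidden[0])
    means = [sum(h[j] for h in hidden) / n for j in range(d)]
    centered = [[h[j] - means[j] for j in range(d)] for h in hidden]
    cov = [[sum(c[j] * c[k] for c in centered) / n for k in range(d)]
           for j in range(d)]
    std = [math.sqrt(cov[j][j]) for j in range(d)]
    eps = 1e-12  # avoids division by zero for constant units
    return [[cov[j][k] / (std[j] * std[k] + eps) for k in range(d)]
            for j in range(d)]


def decorrelation_penalty(hidden):
    """Sum of squared off-diagonal correlations (assumed penalty form)."""
    corr = correlation_matrix(hidden)
    d = len(corr)
    return sum(corr[j][k] ** 2 for j in range(d) for k in range(d) if j != k)


def regularized_loss(task_loss, hidden, weight=0.1):
    """Task loss plus the weighted decorrelation term; weight is hypothetical."""
    return task_loss + weight * decorrelation_penalty(hidden)


# Two redundant units (second is twice the first) versus two uncorrelated units.
redundant = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
uncorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
penalty_redundant = decorrelation_penalty(redundant)
penalty_uncorrelated = decorrelation_penalty(uncorrelated)
```

In a real training loop the same quantity would be computed with a differentiable tensor library so that its gradient flows back into the recurrent weights; the plain-Python version here only shows what is being penalized.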
DeepForest: 3D Hand Pose Estimation Using Deep Network and Random Forest Regression
Quan, Le Manh (Sejong University) | Yong-Guk, Kim (Sejong University)
Hand pose estimation plays an important role in human-computer interaction and virtual reality. In this paper, we present a regression framework to estimate 3D hand pose from a depth image. Unlike previous methods, our method has three key aspects: first, system performance can be improved by setting up better initial images using feature extraction via a Convolutional Neural Network (CNN); second, the error of joint position is estimated by dividing the dataset into groups by gesture type; third, accuracy can be improved by learning the residual intensity of the depth image through constantly updating the residual of the 3D joint coordinates. We note that the importance of categorizing hand poses by gesture when computing joint positions has been underestimated. Experimental evaluation with the public A*STAR dataset shows that our method produces low hand pose estimation error and has potential for future work on hand pose estimation.
Learning from Graph Neighborhoods Using LSTMs
Agrawal, Rakshit (University of California, Santa Cruz) | Alfaro, Luca de (University of California, Santa Cruz) | Polychronopoulos, Vassilis (University of California, Santa Cruz)
Many prediction problems can be phrased as inferences over local neighborhoods of graphs. The graph represents the interaction between entities, and the neighborhood of each entity contains information that allows the inferences or predictions. We present an approach for applying machine learning directly to such graph neighborhoods, yielding predictions for graph nodes on the basis of the structure of their local neighborhood and the features of the nodes in it. Our approach allows predictions to be learned directly from examples, bypassing the step of creating and tuning an inference model or summarizing the neighborhoods via a fixed set of hand-crafted features. The approach is based on a multi-level architecture built from Long Short-Term Memory neural nets (LSTMs); the LSTMs learn how to summarize the neighborhood from data. We demonstrate the effectiveness of the proposed technique on a synthetic example and on real-world data related to crowdsourced grading, Bitcoin transactions, and Wikipedia edit reversions.
Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams
Tuor, Aaron (Western Washington University) | Kaplan, Samuel (Western Washington University) | Hutchinson, Brian (Western Washington University) | Nichols, Nicole (Pacific Northwest National Laboratory) | Robinson, Sean (Pacific Northwest National Laboratory)
Analysis of an organization's computer network activity is a key component of early detection and mitigation of insider threat, a growing concern for many organizations. Raw system logs are a prototypical example of streaming data that can quickly scale beyond the cognitive power of a human analyst. As a prospective filter for the human analyst, we present an online unsupervised deep learning approach to detect anomalous network activity from system logs in real time. Our models decompose anomaly scores into the contributions of individual user behavior features for increased interpretability to aid analysts reviewing potential cases of insider threat. Using the CERT Insider Threat Dataset v6.2 and threat detection recall as our performance metric, our novel deep and recurrent neural network models outperform Principal Component Analysis, Support Vector Machine and Isolation Forest based anomaly detection baselines. For our best model, the events labeled as insider threat activity in our dataset had an average anomaly score in the 95.53 percentile, demonstrating our approach's potential to greatly reduce analyst workloads.
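One way to picture the decomposition of an anomaly score into per-feature contributions is sketched below. The abstract does not specify the scoring function, so this example assumes the model predicts an independent Gaussian (mean and variance) for each feature and scores an observation by its negative log-likelihood; the feature names and all function names are hypothetical.

```python
import math


def feature_contributions(observed, predicted_mean, predicted_var):
    """Per-feature negative log-likelihood under independent Gaussians."""
    return [
        0.5 * math.log(2 * math.pi * v) + (x - m) ** 2 / (2 * v)
        for x, m, v in zip(observed, predicted_mean, predicted_var)
    ]


def anomaly_score(observed, predicted_mean, predicted_var):
    """Total score is the sum of the per-feature contributions."""
    return sum(feature_contributions(observed, predicted_mean, predicted_var))


def top_features(names, contributions, k=2):
    """Features that contribute most to the score, for analyst review."""
    ranked = sorted(zip(names, contributions), key=lambda p: p[1], reverse=True)
    return ranked[:k]


def percentile_rank(score, all_scores):
    """Percentage of scores less than or equal to the given score."""
    return 100.0 * sum(s <= score for s in all_scores) / len(all_scores)


# Hypothetical daily counts for one user; only the last feature is unusual.
names = ["logons", "emails_sent", "usb_connects"]
observed = [5.0, 10.0, 9.0]
mean = [5.0, 10.0, 1.0]
var = [1.0, 1.0, 1.0]
contribs = feature_contributions(observed, mean, var)
score = anomaly_score(observed, mean, var)
most_suspicious = top_features(names, contribs, k=1)
```

Because the total is a plain sum, an analyst can see which behavior features drove a high score, and percentile_rank places that score relative to all other scored events.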