Markov Models
Anchor & Transform: Learning Sparse Representations of Discrete Objects
Liang, Paul Pu, Zaheer, Manzil, Wang, Yuan, Ahmed, Amr
Learning continuous representations of discrete objects such as text, users, and URLs lies at the heart of many applications including language and user modeling. When using discrete objects as input to neural networks, we often ignore the underlying structures (e.g. natural groupings and similarities) and embed the objects independently into individual vectors. As a result, existing methods do not scale to large vocabulary sizes. In this paper, we design a Bayesian nonparametric prior for embeddings that encourages sparsity and leverages natural groupings among objects. We derive an approximate inference algorithm based on Small Variance Asymptotics which yields a simple and natural algorithm for learning a small set of anchor embeddings and a sparse transformation matrix. We call our method Anchor & Transform (ANT) as the embeddings of discrete objects are a sparse linear combination of the anchors, weighted according to the transformation matrix. ANT is scalable, flexible, end-to-end trainable, and allows the user to incorporate domain knowledge about object relationships. On text classification and language modeling benchmarks, ANT demonstrates stronger performance with fewer parameters as compared to existing compression baselines.
TTDM: A Travel Time Difference Model for Next Location Prediction
Liu, Qingjie, Zuo, Yixuan, Yu, Xiaohui, Chen, Meng
Next location prediction is of great importance for many location-based applications and provides essential intelligence to business and governments. In existing studies, a common approach to next location prediction is to learn the sequential transitions with massive historical trajectories based on conditional probability. Unfortunately, due to the time and space complexity, these methods (e.g., Markov models) only use the just passed locations to predict next locations, without considering all the passed locations in the trajectory. In this paper, we seek to enhance the prediction performance by considering the travel time from all the passed locations in the query trajectory to a candidate next location. In particular, we propose a novel method, called Travel Time Difference Model (TTDM), which exploits the difference between the shortest travel time and the actual travel time to predict next locations. Further, we integrate the TTDM with a Markov model via a linear interpolation to yield a joint model, which computes the probability of reaching each possible next location and returns the top-rankings as results. We have conducted extensive experiments on two real datasets: the vehicle passage record (VPR) data and the taxi trajectory data. The experimental results demonstrate significant improvements in prediction accuracy over existing solutions. For example, compared with the Markov model, the top-1 accuracy improves by 40% on the VPR data and by 15.6% on the Taxi data.
Towards Transparent Robotic Planning via Contrastive Explanations
Chen, Shenghui, Boggess, Kayla, Feng, Lu
Providing explanations of chosen robotic actions can help to increase the transparency of robotic planning and improve users' trust. Social sciences suggest that the best explanations are contrastive, explaining not just why one action is taken, but why one action is taken instead of another. We formalize the notion of contrastive explanations for robotic planning policies based on Markov decision processes, drawing on insights from the social sciences. We present methods for the automated generation of contrastive explanations with three key factors: selectiveness, constrictiveness, and responsibility. The results of a user study with 100 participants on the Amazon Mechanical Turk platform show that our generated contrastive explanations can help to increase users' understanding and trust of robotic planning policies while reducing users' cognitive burden.
Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction
Li, Maosen, Chen, Siheng, Zhao, Yangheng, Zhang, Ya, Wang, Yanfeng, Tian, Qi
We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions. The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning. This multiscale graph is adaptive during training and dynamic across network layers. Based on this graph, we propose a multiscale graph computational unit (MGCU) to extract features at individual scales and fuse features across scales. The entire model is action-category-agnostic and follows an encoder-decoder framework. The encoder consists of a sequence of MGCUs to learn motion features. The decoder uses a proposed graph-based gate recurrent unit to generate future poses. Extensive experiments show that the proposed DMGNN outperforms state-of-the-art methods in both short and long-term predictions on the datasets of Human 3.6M and CMU Mocap. We further investigate the learned multiscale graphs for the interpretability. The codes could be downloaded from https://github.com/limaosen0/DMGNN.
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
Khamaru, Koulik, Pananjady, Ashwin, Ruan, Feng, Wainwright, Martin J., Jordan, Michael I.
We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$-error under a generative model. We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms. Theory-inspired simulations show that the widely-used temporal difference (TD) algorithm is strictly suboptimal when evaluated in a non-asymptotic setting, even when combined with Polyak-Ruppert iterate averaging. We remedy this issue by introducing and analyzing variance-reduced forms of stochastic approximation, showing that they achieve non-asymptotic, instance-dependent optimality up to logarithmic factors.
NLPMM: a Next Location Predictor with Markov Modeling
Chen, Meng, Liu, Yang, Yu, Xiaohui
In this paper, we solve the problem of predicting the next locations of the moving objects with a historical dataset of trajectories. We present a Next Location Predictor with Markov Modeling (NLPMM) which has the following advantages: (1) it considers both individual and collective movement patterns in making prediction, (2) it is effective even when the trajectory data is sparse, (3) it considers the time factor and builds models that are suited to different time periods. We have conducted extensive experiments in a real dataset, and the results demonstrate the superiority of NLPMM over existing methods.
Multi-AI competing and winning against humans in iterated Rock-Paper-Scissors game
Wang, Lei, Huang, Wenbing, Li, Yuanpeng, Evans, Julian, He, Sailing
Predicting and modeling human behavior and finding trends within human decision-making process is a major social sciences'problem. Rock Paper Scissors (RPS) is the fundamental strategic question in many game theory problems and real-world competitions. Finding the right approach to beat a particular human opponent is challenging. Here we use Markov Chains with set chain lengths as the single AIs (artificial intelligences) to compete against humans in iterated RPS game. This is the first time that an AI algorithm is applied in RPS human competition behavior studies. We developed an architecture of multi-AI with changeable parameters to adapt to different competition strategies. We introduce a parameter "focus length" (an integer of e.g. 5 or 10) to control the speed and sensitivity for our multi-AI to adapt to the opponent's strategy change. We experimented with 52 different people, each playing 300 rounds continuously against one specific multi-AI model, and demonstrated that our strategy could win over more than 95% of human opponents.
Minor Constraint Disturbances for Deep Semi-supervised Learning
Chu, Jielei, Liu, Jing, Wang, Hongjun, Gong, Zhiguo, Li, Tianrui
In high-dimensional data space, semi-supervised feature learning based on Euclidean distance shows instability under a broad set of conditions. Furthermore, the scarcity and high cost of labels prompt us to explore new semi-supervised learning methods with the fewest labels. In this paper, we develop a novel Minor Constraint Disturbances-based Deep Semi-supervised Feature Learning framework (MCD-DSFL) from the perspective of probability distribution for feature representation. There are two fundamental modules in the proposed framework: one is a Minor Constraint Disturbances-based restricted Boltzmann machine with Gaussian visible units (MCDGRBM) for modelling continuous data and the other is a Minor Constraint Disturbances-based restricted Boltzmann machine (MCDRBM) for modelling binary data. The Minor Constraint Disturbances (MCD) consist of less instance-level constraints which are produced by only two randomly selected labels from each class. The Kullback-Leibler (KL) divergences of the MCD are fused into the Contrastive Divergence (CD) learning for training the proposed MCDGRBM and MCDRBM models. Then, the probability distributions of hidden layer features are as similar as possible in the same class and they are as dissimilar as possible in the different classes simultaneously. Despite the weak influence of the MCD for our shallow models (MCDGRBM and MCDRBM), the proposed deep MCD-DSFL framework improves the representation capability significantly under its leverage effect. The semi-supervised strategy based on the KL divergence of the MCD significantly reduces the reliance on the labels and improves the stability of the semi-supervised feature learning in high-dimensional space simultaneously.
Online Guest Detection in a Smart Home using Pervasive Sensors and Probabilistic Reasoning
Renoux, Jennifer, Köckemann, Uwe, Loutfi, Amy
Smart home environments equipped with distributed sensor networks are capable of helping people by providing services related to health, emergency detection or daily routine management. A backbone to these systems relies often on the system's ability to track and detect activities performed by the users in their home. Despite the continuous progress in the area of activity recognition in smart homes, many systems make a strong underlying assumption that the number of occupants in the home at any given moment of time is always known. Estimating the number of persons in a Smart Home at each time step remains a challenge nowadays. Indeed, unlike most (crowd) counting solution which are based on computer vision techniques, the sensors considered in a Smart Home are often very simple and do not offer individually a good overview of the situation. The data gathered needs therefore to be fused in order to infer useful information. This paper aims at addressing this challenge and presents a probabilistic approach able to estimate the number of persons in the environment at each time step. This approach works in two steps: first, an estimate of the number of persons present in the environment is done using a Constraint Satisfaction Problem solver, based on the topology of the sensor network and the sensor activation pattern at this time point. Then, a Hidden Markov Model refines this estimate by considering the uncertainty related to the sensors. Using both simulated and real data, our method has been tested and validated on two smart homes of different sizes and configuration and demonstrates the ability to accurately estimate the number of inhabitants.
Regret Bound of Adaptive Control in Linear Quadratic Gaussian (LQG) Systems
Lale, Sahin, Azizzadenesheli, Kamyar, Hassibi, Babak, Anandkumar, Anima
One of the core challenges in the field of control theory and reinforcement learning is adaptive control. It is the problem of controlling dynamical systems when the dynamics of the systems are unknown to the decision-making agents. In adaptive control, agents interact with given systems in order to explore and control them while the long-term objective is to minimize the overall average associated costs. The agent has to balance between exploration and exploitation, learn the dynamics, strategize for further exploration, and exploit the estimation to minimize the overall costs. The sequential nature of agent-system interaction results in challenges in the system identifying, estimation, and control under uncertainty, and these challenges are magnified when the systems are partially observable, i.e. contain hidden underlying dynamics. In the linear systems, when the underlying dynamics are fully observable, the asymptotic optimality of estimation methods has been the topic of study in the last decades [Lai et al., 1982, Lai and Wei, 1987]. Recently, novel techniques and learning algorithms have been developed to study the finite-time behavior of adaptive control algorithms and shed light on the design of optimal methods [Peña et al., 2009, Fiechter, 1997, Abbasi-Yadkori and Szepesvári, 2011]. In particular, Abbasi-Yadkori and Szepesvári [2011] proposes to use the principle of optimism in the face of uncertainty (OFU) to balance exploration and exploitation in LQR, where the state of the system is observable.