Undirected Networks
Unsupervised Co-Activity Detection from Multiple Videos Using Absorbing Markov Chain
Yeo, Donghun (POSTECH) | Han, Bohyung (POSTECH) | Han, Joon Hee (POSTECH)
We propose a simple but effective unsupervised learning algorithm to detect a common activity (co-activity) from a set of videos, which is formulated using absorbing Markov chain in a principled way. In our algorithm, a complete multipartite graph is first constructed, where vertices correspond to subsequences extracted from videos using a temporal sliding window and edges connect between the vertices originated from different videos; the weight of an edge is proportional to the similarity between the features of two end vertices. Then, we extend the graph structure by adding edges between temporally overlapped subsequences in a video to handle variable-length co-activities using temporal locality, and create an absorbing vertex connected from all other nodes. The proposed algorithm identifies a subset of subsequences as co-activity by estimating absorption time in the constructed graph efficiently. The great advantage of our algorithm lies in the properties that it can handle more than two videos naturally and identify multiple instances of a co-activity with variable lengths in a video. Our algorithm is evaluated intensively in a challenging dataset and illustrates outstanding performance quantitatively and qualitatively.
Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks
Ondruska, Peter (University of Oxford) | Posner, Ingmar (University of Oxford)
This paper presents to the best of our knowledge the first end-to-end object tracking approach which directly maps from raw sensor input to object tracks in sensor space without requiring any feature engineering or system identification in the form of plant or sensor models. Specifically, our system accepts a stream of raw sensor data at one end and, in real-time, produces an estimate of the entire environment state at the output including even occluded objects. We achieve this by framing the problem as a deep learning task and exploit sequence models in the form of recurrent neural networks to learn a mapping from sensor measurements to object tracks. In particular, we propose a learning method based on a form of input dropout which allows learning in an unsupervised manner, only based on raw, occluded sensor data without access to ground-truth annotations. We demonstrate our approach using a synthetic dataset designed to mimic the task of tracking objects in 2D laser data — as commonly encountered in robotics applications — and show that it learns to track many dynamic objects despite occlusions and the presence of sensor noise.
Distance Minimization for Reward Learning from Scored Trajectories
Burchfiel, Benjamin (Duke University) | Tomasi, Carlo (Duke University) | Parr, Ronald (Duke University)
Many planning methods rely on the use of an immediate reward function as a portable and succinct representation of desired behavior. Rewards are often inferred from demonstrated behavior that is assumed to be near-optimal. We examine a framework, Distance Minimization IRL (DM-IRL), for learning reward functions from scores an expert assigns to possibly suboptimal demonstrations. By changing the expert’s role from a demonstrator to a judge, DM-IRL relaxes some of the assumptions present in IRL, enabling learning from the scoring of arbitrary demonstration trajectories with unknown transition functions. DM-IRL complements existing IRL approaches by addressing different assumptions about the expert. We show that DM-IRL is robust to expert scoring error and prove that finding a policy that produces maximally informative trajectories for an expert to score is strongly NP-hard. Experimentally, we demonstrate that the reward function DM-IRL learns from an MDP with an unknown transition model can transfer to an agent with known characteristics in a novel environment, and we achieve successful learning with limited available training data.
RAO*: An Algorithm for Chance-Constrained POMDP's
Santana, Pedro Henrique Rodrigues Quemel e Assis (Massachusetts Institute of Technology) | Thiébaux, Sylvie (The Australian National University and NICTA) | Williams, Brian (Massachusetts Institute of Technology)
Autonomous agents operating in partially observable stochastic environments often face the problem of optimizing expected performance while bounding the risk of violating safety constraints. Such problems can be modeled as chance-constrained POMDP's (CC-POMDP's). Our first contribution is a systematic derivation of execution risk in POMDP domains, which improves upon how chance constraints are handled in the constrained POMDP literature. Second, we present RAO*, a heuristic forward search algorithm producing optimal, deterministic, finite-horizon policies for CC-POMDP's. In addition to the utility heuristic, RAO* leverages an admissible execution risk heuristic to quickly detect and prune overly-risky policy branches. Third, we demonstrate the usefulness of RAO* in two challenging domains of practical interest: power supply restoration and autonomous science agents.
A Symbolic SAT-Based Algorithm for Almost-Sure Reachability with Small Strategies in POMDPs
Chatterjee, Krishnendu (IST Austria) | Chmelík, Martin (IST Austria) | Davies, Jessica (IST Austria)
The qualitative problem is of great importance as in several applications it is The de facto model for dynamic systems with probabilistic required that the correct behavior happens with probability 1, and nondeterministic behavior are Markov decision processes e.g., in the analysis of randomized embedded schedulers, (MDPs) (Howard 1960). MDPs provide the appropriate the important question is whether every thread progresses model to solve control and probabilistic planning problems with probability 1. Also in applications where it might be (Filar and Vrieze 1997; Puterman 1994), where the nondeterminism sufficient that the correct behavior happens with probability represents the choice of the control actions for at least λ 1,the correct choice of the threshold λ can the controller (or planner), while the stochastic response of be still challenging, due to simplifications and imprecisions the system to control actions is represented by the probabilistic introduced during modeling.
Solving Risk-Sensitive POMDPs With and Without Cost Observations
Hou, Ping (New Mexico State University) | Yeoh, William (New Mexico State University) | Varakantham, Pradeep (Singapore Management University)
Partially Observable Markov Decision Processes (POMDPs) are often used to model planning problems under uncertainty. The goal in Risk-Sensitive POMDPs (RS-POMDPs) is to find a policy that maximizes the probability that the cumulative cost is within some user-defined cost threshold. In this paper, unlike existing POMDP literature, we distinguish between the two cases of whether costs can or cannot be observed and show the empirical impact of cost observations. We also introduce a new search-based algorithm to solve RS-POMDPs and show that it is faster and more scalable than existing approaches in two synthetic domains and a taxi domain generated with real-world data.
Truncated Approximate Dynamic Programming with Task-Dependent Terminal Value
Farahmand, Amir-massoud (Mitsubishi Electric Research Laboratories (MERL)) | Nikovski, Daniel N. (Mitsubishi Electric Research Laboratories (MERL)) | Igarashi, Yuji (Mitsubishi Electric Corporation) | Konaka, Hiroki (Mitsubishi Electric Corporation)
We propose a new class of computationally fast algorithms to find close to optimal policy for Markov Decision Processes (MDP) with large finite horizon T.The main idea is that instead of planning until the time horizon T, we plan only up to a truncated horizon H << T and use an estimate of the true optimal value function as the terminal value. Our approach of finding the terminal value function is to learn a mapping from an MDP to its value function by solving many similar MDPs during a training phase and fit a regression estimator. We analyze the method by providing an error propagation theorem that shows the effect of various sources of errors to the quality of the solution. We also empirically validate this approach in a real-world application of designing an energy management system for Hybrid Electric Vehicles with promising results.
Convolution Kernels for Discriminative Learning from Streaming Text
Lukasik, Michal (University of Sheffield) | Cohn, Trevor (University of Melbourne)
Time series modeling is an important problem with many applications in different domains. Here we consider discriminative learning from time series, where we seek to predict an output response variable based on time series input. We develop a method based on convolution kernels to model discriminative learning over streams of text. Our method outperforms competitive baselines in three synthetic and two real datasets, rumour frequency modeling and popularity prediction tasks.
Modeling Evolving Relationships Between Characters in Literary Novels
Chaturvedi, Snigdha (University of Maryland, College Park) | Srivastava, Shashank (Carnegie Mellon University) | III, Hal Daume (University of Maryland, College Park) | Dyer, Chris (Carnegie Mellon University)
Studying characters plays a vital role in computationally representing and interpreting narratives. Unlike previous work, which has focused on inferring character roles, we focus on the problem of modeling their relationships. Rather than assuming a fixed relationship for a character pair, we hypothesize that relationships temporally evolve with the progress of the narrative, and formulate the problem of relationship modeling as a structured prediction problem. We propose a semi-supervised framework to learn relationship sequences from fully as well as partially labeled data. We present a Markovian model capable of accumulating historical beliefs about the relationship and status changes. We use a set of rich linguistic and semantically motivated features that incorporate world knowledge to investigate the textual content of narrative. We empirically demonstrate that such a framework outperforms competitive baselines.
Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions
Clark, Peter (Allen Institute for AI) | Etzioni, Oren (Allen Institute for AI) | Khot, Tushar (Allen Institute for AI) | Sabharwal, Ashish (Allen Institute for AI) | Tafjord, Oyvind (Allen Institute for AI) | Turney, Peter (Allen Institute for AI) | Khashabi, Daniel (Univ. Illinois at Urbana-Champaign)
What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base, to achieve substantially improved results. We evaluate the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam (using only non-diagram, multiple choice questions), and show that our overall system’s score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work. We conclude with a detailed analysis, illustrating the complementary strengths of each method in the ensemble. Our datasets are being released to enable further research.