Undirected Networks
Relational Markov Decision Processes: Promise and Prospects
Joshi, Saket ( Cycorp, Inc. ) | Khardon, Roni (Tufts University) | Tadepalli, Prasad (Oregon State University) | Fern, Alan (Oregon State University) | Raghavan, Aswin (Oregon State University)
Relational Markov Decision Processes (RMDPs) offer an elegant formalism that combines probabilistic and relational knowledge representations with the decision-theoretic notions of action and utility. In this paper we motivate RMDPs to address a variety of problems in AI, including open world planning, transfer learning, and relational inference. We describe a symbolic dynamic programming approach via the "template method" which addresses the problem of reasoning about exogenous events. We end with a discussion of the challenges involved and some promising future research directions.
A General Framework for Recognizing Complex Events in Markov Logic
Song, Young Chol (University of Rochester) | Kautz, Henry (University of Rochester) | Li, Yuncheng (University of Rochester) | Luo, Jiebo (University of Rochester)
We present a robust framework for complex event recognition that is well-suited for integrating information that varies widely in detail and granularity. Consider the scenario of an agent in an instrumented space performing a complex task while describing what he is doing in a natural manner. The system takes in a variety of information, including objects and gestures recognized by RGB-D and descriptions of events extracted from recognized and parsed speech. The system outputs a complete reconstruction of the agentโs plan, explaining actions in terms of more complex activities and filling in unobserved but necessary events. We show how to use Markov Logic (a probabilistic extension to first order logic) to create a theory in which observations can be partial, noisy, and refer to future or temporally ambiguous events; complex events are composed from simpler events in a manner that exposes their structure for inference and learning; and uncertainty is handled in a sound probabilistic manner. We demonstrate the effectiveness of the approach for tracking cooking plans in the presence of noisy and incomplete observations.
Accuracy and Timeliness in ML Based Activity Recognition
Ross, Robert (Dublin Institute of Technology) | Kelleher, John (Dublin Institute of Technology)
While recent Machine Learning (ML) based techniques for activity recognition show great promise, there remain a number of questions with respect to the relative merits of these techniques. To provide a better understanding of the relative strengths of contemporary Activity Recognition methods, in this paper we present a comparative analysis of Hidden Markov Model, Bayesian, and Support Vector Machine based human activity recognition models. The study builds on both pre-existing and newly annotated data which includes interleaved activities. Results demonstrate that while Support Vector Machine based techniques perform well for all data sets considered, simple representations of sensor histories regularly outperform more complex count based models.
Autonomous Hierarchical POMDP Planning from Low-Level Sensors
Squire, Shawn (University of Maryland, Baltimore County) | desJardins, Marie (University of Maryland, Baltimore County)
There are currently no strong methods for planning in a stochastic domain, with low-level sensors that are limited and possibly inaccurate. Existing architectures have flaws that make their use in a real-world environment impractical. We propose an architecture that utilizes POMDPs to create a hierarchical planning system. This system is capable of developing macro-actions that can expedite planning on a large scale, and can learn new plans quickly and efficiently, without deliberate design by the programmer.
Exploring Disease Interactions Using Markov Networks
Haaren, Jan Van (Katholieke Universiteit Leuven) | Davis, Jesse (Katholieke Universiteit Leuven) | Lappenschaar, Martijn (Radboud Universiteit Nijmegen) | Hommersom, Arjen (Radboud Universiteit Nijmegen)
Network medicine is an emerging paradigm for studying the co-occurrence between diseases. While diseases are often interlinked through complex patterns, most of the existing work in this area has focused on studying pairwise relationships between diseases. In this paper, we use a state-of-the-art Markov network learning method to learn interactions between musculoskeletal disorders and cardiovascular diseases and compare this to pairwise approaches. Our experimental results confirm that the sophisticated structure learner produces more accurate models, which can help reveal interesting patterns in the co-occurrence of diseases.
Multi-Target Detection and Recognition by UAVs Using Online POMDPs
Chanel, Caroline P. Carvalho (ISAE) | Teichteil-Kรถnigsbuch, Florent (ONERA) | Lesire, Charles (ONERA)
This paper tackles high-level decision-making techniques for robotic missions, which involve both active sensing and symbolic goal reaching, under uncertain probabilistic environments and strong time constraints. Our case study is a POMDP model of an online multi-target detection and recognition mission by an autonomous UAV. The POMDP model of the multi-target detection and recognition problem is generated online from a list of areas of interest, which are automatically extracted at the beginning of the flight from a coarse-grained high altitude observation of the scene. The POMDP observation model relies on a statistical abstraction of an image processing algorithm's output used to detect targets. As the POMDP problem cannot be known and thus optimized before the beginning of the flight, our main contribution is an "optimize-while-execute" algorithmic framework: it drives a POMDP sub-planner to optimize and execute the POMDP policy in parallel under action duration constraints. We present new results from real outdoor flights and SAIL simulations, which highlight both the benefits of using POMDPs in multi-target detection and recognition missions, and of our "optimize-while-execute" paradigm.
Enforcing Meter in Finite-Length Markov Sequences
Roy, Pierre (Associate Researcher) | Pachet, Francois (Sony CSL Paris)
Markov processes are increasingly used to generate finite-length sequences that imitate a given style. However, Markov processes are notoriously difficult to control. Recently, Markov constraints have been introduced to give users some control on generated sequences. Markov constraints reformulate finite-length Markov sequence generation in the framework of constraint satisfaction (CSP). However, in practice, this approach is limited to local constraints and its performance is low for global constraints, such as cardinality or arithmetic constraints. This limitation prevents generated sequences to follow structural properties which are independent of the style, but inherent to the domain, such as meter. In this article, we introduce meter, a constraint that ensures a sequence is 1) Markovian with regards to a given corpus and 2) follows metrical rules expressed as cumulative cost functions. Additionally, meter can simultaneously enforce cardinality constraints. We propose a domain consistency algorithm whose complexity is pseudo-polynomial. This result is obtained thanks to a theorem on the growth of sumsets by Khovanskii. We illustrate our constraint on meter-constrained music generation problems that were so far not solvable by any other technique.
Sample Complexity and Performance Bounds for Non-Parametric Approximate Linear Programming
Pazis, Jason (Duke University) | Parr, Ronald (Duke University)
One of the most difficult tasks in value function approximation for Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function, while at the same time not overfitting the training samples. Recent results in non-parametric approximate linear programming (NP-ALP), have demonstrated that this can be done effectively using nothing more than a smoothness assumption on the value function. In this paper we extend these results to the case where samples come from real world transitions instead of the full Bellman equation, adding robustness to noise. In addition, we provide the first max-norm, finite sample performance guarantees for any form of ALP. NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution.
PAC Optimal Exploration in Continuous Space Markov Decision Processes
Pazis, Jason (Duke University) | Parr, Ronald (Duke University)
Current exploration algorithms can be classified in two broad categories: Heuristic, and PAC optimal. While numerous researchers have used heuristic approaches such as epsilon-greedy exploration successfully, such approaches lack formal, finite sample guarantees and may need a significant amount of fine-tuning to produce good results. PAC optimal exploration algorithms, on the other hand, offer strong theoretical guarantees but are inapplicable in domains of realistic size. The goal of this paper is to bridge the gap between theory and practice, by introducing C-PACE, an algorithm which offers strong theoretical guarantees and can be applied to interesting, continuous space problems.
Dynamic Social Choice with Evolving Preferences
Parkes, David C. (Harvard University) | Procaccia, Ariel D. (Carnegie Mellon University)
Social choice theory provides insights into a variety of collective decision making settings, but nowadays some of its tenets are challenged by internet environments, which call for dynamic decision making under constantly changing preferences. In this paper we model the problem via Markov decision processes (MDP), where the states of the MDP coincide with preference profiles and a (deterministic, stationary) policy corresponds to a social choice function. We can therefore employ the axioms studied in the social choice literature as guidelines in the design of socially desirable policies. We present tractable algorithms that compute optimal policies under different prominent social choice constraints. Our machinery relies on techniques for exploiting symmetries and isomorphisms between MDPs.