Learning Graphical Models
Symbolic Variable Elimination for Discrete and Continuous Graphical Models
Sanner, Scott (NICTA and Australian National University) | Abbasnejad, Ehsan (Australian National University and NICTA)
Probabilistic reasoning in the real-world often requires inference incontinuous variable graphical models, yet there are few methods for exact, closed-form inference when joint distributions are non-Gaussian. To address this inferential deficit, we introduce SVE -- a symbolic extension of the well-known variable elimination algorithm to perform exact inference in an expressive class of mixed discrete and continuous variable graphical models whose conditional probability functions can be well-approximated as piecewise combinations of polynomials with bounded support. Using this representation, we show that we can compute all of the SVE operations exactly and in closed-form, which crucially includes definite integration w.r.t. multivariate piecewise polynomial functions. To aid in the efficient computation and compact representation of this solution, we use an extended algebraic decision diagram (XADD) data structure that supports all SVE operations. We provide illustrative results for SVE on probabilistic inference queries inspired by robotics localization and tracking applications that mix various continuous distributions; this represents the first time a general closed-form exact solution has been proposed for this expressive class of discrete/continuous graphical models.
Covering Number as a Complexity Measure for POMDP Planning and Learning
Zhang, Zongzhang (University of Science and Technology of China) | Littman, Michael (Rutgers University) | Chen, Xiaoping (University of Science and Technology of China)
Finding a meaningful way of characterizing the difficulty of partially observable Markov decision processes (POMDPs) is a core theoretical problem in POMDP research. State-space size is often used as a proxy for POMDP difficulty, but it is a weak metric at best. Existing work has shown that the covering number for the reachable belief space, which is a set of belief points that are reachable from the initial belief point, has interesting links with the complexity of POMDP planning, theoretically. In this paper, we present empirical evidence that the covering number for the reachable belief space (or just ``covering number", for brevity) is a far better complexity measure than the state-space size for both planning and learning POMDPs on several small-scale benchmark problems. We connect the covering number to the complexity of learning POMDPs by proposing a provably convergent learning algorithm for POMDPs without reset given knowledge of the covering number.
Influence-Based Abstraction for Multiagent Systems
Oliehoek, Frans Adriaan (Maastricht University) | Witwicki, Stefan J. (INESC-ID) | Kaelbling, Leslie Pack (Massachusetts Institute of Technology)
This paper presents a theoretical advance by which factored POSGs can be decomposed into local models. We formalize the interface between such local models as the influence agents can exert on one another; and we prove that this interface is sufficient for decoupling them. The resulting influence-based abstraction substantially generalizes previous work on exploiting weakly-coupled agent interaction structures. Therein lie several important contributions. First, our general formulation sheds new light on the theoretical relationships among previous approaches, and promotes future empirical comparisons that could come by extending them beyond the more specific problem contexts for which they were developed. More importantly, the influence-based approaches that we generalize have shown promising improvements in the scalability of planning for more restrictive models. Thus, our theoretical result here serves as the foundation for practical algorithms that we anticipate will bring similar improvements to more general planning contexts, and also into other domains such as approximate planning, decision-making in adversarial domains, and online learning.
Counting-MLNs: Learning Relational Structure for Decision Making
Nath, Aniruddh (University of Washington) | Richardson, Matthew (Microsoft Research)
Many first-order probabilistic models can be represented much more compactly using aggregation operations such as counting. While traditional statistical relational representations share factors across sets of interchangeable random variables, representations that explicitly model aggregations also exploit interchangeability of random variables within factors. This is especially useful in decision making settings, where an agent might need to reason about counts of the different types of objects it interacts with. Previous work on counting formulas in statistical relational representations has mostly focused on the problem of exact inference on an existing model. The problem of learning such models is largely unexplored. In this paper, we introduce Counting Markov Logic Networks (C-MLNs), an extension of Markov logic networks that can compactly represent complex counting formulas. We present a structure learning algorithm for C-MLNs; we apply this algorithm to the novel problem of generalizing natural language instructions, and to relational reinforcement learning in the Crossblock domain, in which standard MLN learning algorithms fail to find any useful structure. The C-MLN policies learned from natural language instructions are compact and intuitive, and, despite requiring no instructions on test games, win 20% more Crossblock games than a state-of-the-art algorithm for following natural language instructions.
What's in a URL? Genre Classification from URLs
Abramson, Myriam (US Naval Research Laboratory) | Aha, David W. (US Naval Research Laboratory)
The importance of URLs in the representation of a document cannot be overstated. Shorthand mnemonics such as ``wiki'' or ``blog'' are often embedded in a URL to convey its functional purpose or genre. Other mnemonics have evolved from use (e.g., a Wordpress particle is strongly suggestive of blogs). Can we leverage from this predictive power to induce the genre of a document from the representation of a URL? This paper presents a methodology for webpage genre classification from URLs which, to our knowledge, has not been previously attempted. Experiments using machine learning techniques to evaluate this claim show promising results and a novel algorithm for character n-gram decomposition is provided. Such a capability could be useful to improve personalized search results, disambiguate content, efficiently crawl the Web in search of relevant documents, and construct behavioral profiles from clickstream data without parsing the entire document.
Dynamically Switching between Synergistic Workflows for Crowdsourcing
Lin, Christopher H (University of Washington) | Mausam, . (University of Washington) | Weld, Daniel S (University of Washington)
To ensure quality results from unreliable crowdsourced workers, task designers often construct complex workflows and aggregate worker responses from redundant runs. Frequently, they create several alternative workflows to accomplish the task, and choose a single workflow to deploy (perhaps the one that achieves the best performance during early experiments). However, this seemingly natural design paradigm does not achieve the full potential of crowdsourcing. In particular, using a single workflow (even the best) to accomplish a task is suboptimal. We show that alternative workflows can compose synergistically to yield a much higher quality output. We formalize the insight with a novel probabilistic graphical model, design and implement AgentHunt, a POMDP-based controller that dynamically switches between these workflows to achieve higher returns on investment, and design offline and online methods for learning model parameters. Live experiments on Amazon Mechanical Turk demonstrate the superiority of AgentHunt for the practical task of generating NLP training data, yielding up to 50% error reduction and greater net utility compared to previous methods.
Teaching Localization in Probabilistic Robotics
Martin, Fred G. (University of Massachusetts Lowell) | Dalphond, James (University of Massachusetts Lowell) | Tuck, Nat (University of Massachusetts Lowell)
In the field of probabilistic robotics, a central problem is to determine a robot's state given knowledge of a time series of control commands and sensor readings. The effects of control commands and the behavior of sensor devices are both modeled probabilistically. A variety of methods are available for deriving the robot's belief state, which is a probabilistic representation of the robot's true state (which cannot be directly known). This paper presents a series of five assignments to teach this material at the advanced undergraduate/graduate level. The theoretical aspect of the work is reinforced by practical implementation exercises using ROS (Robot Operating System), and the Bilibot, an educational robot platform.
Complex Task Learning from Unstructured Demonstrations
Niekum, Scott (University of Massachusetts Amherst)
Much work in learning from demonstration has focused on learning simple tasks from structured demonstrations that have a well-defined beginning and end. As we attempt to scale robot learning to increasingly complex tasks, it becomes intractable to learn task policies monolithically. Furthermore, it is desirable to be able to learn from natural, unstructured demonstrations, which are unsegmented, possibly incomplete, and may come from different tasks. We propose a three-part approach to designing a natural, scalable system that allows a robot to learn tasks of increasing complexity by automatically building and refining a library of skills over time. First, we describe a Bayesian nonparametric model that can segment unstructured demonstrations into appropriate numbers of component skills and recognize repeated skills across demonstrations and tasks. These skills can then be parameterized and generalized to new situations. Second, we propose to create a system that allows the user to provide unstructured corrections and feedback to the robot, without requiring any knowledge of the robot's underlying representation of the task or its component skills. Third, we propose to infer the user's intentions for each segmented skill and autonomously improve these skills using reinforcement learning. This approach will be applied to learn and generalize complex, multi-step tasks that are beyond the reach of current LfD methods, using the PR2 mobile manipulator as a testing platform.
Combining Probabilistic Planning and Logic Programming on Mobile Robots
Zhang, Shiqi (Texas Tech University) | Bao, Forrest Sheng (Texas Tech University) | Sridharan, Mohan (Texas Tech University)
Key challenges to widespread deployment of mobile robots to interact with humans in real-world domains include the ability to: (a) robustly represent and revise domain knowledge; (b) autonomously adapt sensing and processing to the task at hand; and (c) learn from unreliable high-level human feedback. Partially observable Markov decision processes (POMDPs) have been used to plan sensing and navigation in different application domains. It is however a challenge to include common sense knowledge obtained from sensory or human inputs in POMDPs. In addition, information extracted from sensory and human inputs may have varying levels of relevance to current and future tasks. On the other hand, although a non-monotonic logic programming paradigm such as Answer Set Programming (ASP) is wellsuited for common sense reasoning, it is unable to model the uncertainty in real-world sensing and navigation (Gelfond 2008). This paper presents a hybrid framework that integrates ASP, hierarchical POMDPs (Zhang and Sridharan 2012) and psychophysics principles to address the challenges stated above. Experimental results in simulation and on mobile robots deployed in indoor domains show that the framework results in reliable and efficient operation.
A Testbed for Learning by Demonstration from Natural Language and RGB-Depth Video
Song, Young Chol (University of Rochester) | Kautz, Henry (University of Rochester)
We are developing a testbed for learning by demonstration combining spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us to infer high-level visual features, such as the relative position of objects in space, with greater precision and less training than required by traditional systems. Speech is recognized and parsed using a “deep” parsing system, so that language features are available at the word, syntactic, and semantic levels. We collected an initial data set of 10 episodes of 7 individuals demonstrating how to “make tea”, and created a “gold standard” hand annotation of the actions performed in each. Finally, we are constructing “baseline” HMM-based activity recognition models using the visual and language features, in order to be ready to evaluate the performance of our future work on deeper and more structured models.