Learning Graphical Models
PREGO: An Action Language for Belief-Based Cognitive Robotics in Continuous Domains
Belle, Vaishak (University of Toronto) | Levesque, Hector (University of Toronto)
The area of cognitive robotics is often subject to the criticism that the proposals investigated in the literature are too far removed from the kind of continuous uncertainty and noise seen in actual real-world robotics. This paper proposes a new language and an implemented system, called PREGO, based on the situation calculus, that is able to reason effectively about degrees of belief against noisy sensors and effectors in continuous domains. It embodies the representational richness of conventional logic-based action languages, such as context-sensitive successor state axioms, but is still shown to be efficient using a number of empirical evaluations. We believe that PREGO is a powerful framework for exploring real-time reactivity and an interesting bridge between logic and probability for cognitive robotics applications.
Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes
Shen, Huawei (Chinese Academy of Sciences) | Wang, Dashun (IBM Thomas J. Watson Research Center) | Song, Chaoming (University of Miami) | Barabási, Albert-László (Northeastern University)
Indeed, to the best of our knowledge, we lack forgotten over time (Wu and Humberman 2007). For example, a probabilistic framework to model and predict the popularity videos on YouTube or stories on Digg gain their popularity dynamics of individual items. The reason behind this is by striving for views or votes (Szabo and Huberman partly illustrated in Figure 1, suggesting that the dynamical 2010); papers increase their visibility by competing for citations processes governing individual items appear too noisy to be from new papers (Ren et al. 2010; Wang, Song, and amenable to quantification. Barabási 2013); tweets or Hashtags in Twitter become more In this paper, we model the stochastic popularity dynamics popular as being retweeted (Hong, Dan, and Davison 2011) using reinforced Poisson processes, capturing simultaneously and so do webpages as being attached by incoming hyperlinks three key ingredients: fitness of an item, characterizing (Ratkiewicz et al. 2010). An ability to predict the popularity its inherent competitiveness against other items; a general of individual items within a dynamically evolving system temporal relaxation function, corresponding to the aging not only probes our understanding of complex systems, in the ability to attract new attentions; and a reinforcement but also has important implications in a wide range of domains, mechanism, documenting the well-known "rich-get-richer" from marketing and traffic control to policy making phenomenon. The benefit of the proposed model is threefold: and risk management. Despite recent advances of empirical (1) It models the arrival process of individual attentions methods, we lack a general modeling framework to predict directly in contrast to relying on aggregated popularity the popularity of individual items within a complex evolving time series; (2) As a generative probabilistic model, it can be system.
GP-Localize: Persistent Mobile Robot Localization Using Online Sparse Gaussian Process Observation Model
Xu, Nuo (National University of Singapore) | Low, Kian Hsiang (National University of Singapore) | Chen, Jie (Singapore-MIT Alliance for Research and Technology) | Lim, Keng Kiat (National University of Singapore) | Ozgul, Etkin Baris (National University of Singapore)
Central to robot exploration and mapping is the task of persistent localization in environmental fields characterized by spatially correlated measurements. This paper presents a Gaussian process localization (GP-Localize) algorithm that, in contrast to existing works, can exploit the spatially correlated field measurements taken during a robot's exploration (instead of relying on prior training data) for efficiently and scalably learning the GP observation model online through our proposed novel online sparse GP. As a result, GP-Localize is capable of achieving constant time and memory (i.e., independent of the size of the data) per filtering step, which demonstrates the practical feasibility of using GPs for persistent robot localization and autonomy. Empirical evaluation via simulated experiments with real-world datasets and a real robot experiment shows that GP-Localize outperforms existing GP localization algorithms.
Predictive Models for Determining If and When to Display Online Lead Forms
Chan, Tim (Edmunds.com) | I, Joseph (Edmunds.com) | Macasaet, Carlos (Edmunds.com) | Kang, Daniel (Edmunds.com) | Hardy, Robert M. (Edmunds.com) | Ruiz, Carlos (Edmunds.com) | Porras, Rigel (Edmunds.com) | Baron, Brian (Edmunds.com) | Qazi, Karim (Edmunds.com) | Hannon, Padraic (Edmunds.com) | Honda, Tomonori (Edmunds.com)
This paper will demonstrate a machine learning appli- cation for predicting positive lead conversion events on the Edmunds.com website, an American destination for car shopping. A positive conversion event occurs when a user fills out and submits a lead form interstitial. We used machine learning to identify which users might want to fill out lead forms, and where in their sessions to present the interstitials. There are several factors that make these predictions difficult, such as (a) far more negative than positive responses (b) seasonality effects due to car sales events near holidays, which require the model to be easily tunable and (c) the need for compu- tationally fast predictions for real-time decision-making in order to minimize any impact on the website’s us- ability. Rather than develop a single highly complex model, we used an ensemble of three simple models: Naive Bayes, Markov Chain, and Vowpal Wabbit. The ensemble generated significant lift over random predic- tions and demonstrated comparable accuracy to an ex- ternal consulting company’s model.
Content-Structural Relation Inference in Knowledge Base
Zhao, Zeya (Chinese Academy of Sciences) | Jia, Yantao (Chinese Academy of Sciences) | Wang, Yuanzhuo
Relation inference between concepts in knowledge base has been extensively studied in recent years. Previous methods mostly apply the relations in the knowledge base, without fully utilizing the contents, i.e., the attributes of concepts in knowledge base. In this paper, we propose a content-structural relation inference method (CSRI) which integrates the content and structural information between concepts for relation inference. Experiments on data sets show that CSRI obtains 15% improvement compared with the state-of-the-art methods.
A Novel Single-DBN Generative Model for Optimizing POMDP Controllers by Probabilistic Inference
Kiselev, Igor (University of Waterloo) | Poupart, Pascal (University of Waterloo)
As a promising alternative to using standard (often intractable) planning techniques with Bellman equations, we propose an interesting method of optimizing POMDP controllers by probabilistic inference in a novel equivalent single-DBN generative model. Our inference approach to POMDP planning allows for (1) for application of various techniques for probabilistic inference in single graphical models, and (2) for exploiting the factored structure in a controller architecture to take advantage of natural structural constrains of planning problems and represent them compactly. Our contributions can be summarized as follows: (1) we designed a novel single-DBN generative model that ensures that the task of probabilistic inference is equivalent to the original problem of optimizing POMDP controllers, and (2) we developed several inference approaches to approximate the value of the policy when exact inference methods are not tractable to solve large-size problems with complex graphical models. The proposed approaches to policy optimization by probabilistic inference are evaluated on several POMDP benchmark problems and the performance of the implemented approximation algorithms is compared.
A Model for Aggregating Contributions of Synergistic Crowdsourcing Workflows
Fang, Yili (Beihang University) | Sun, Hailong (Beihang University) | Zhang, Richong (Beihang University) | Huai, Jinpeng (Beihang University) | Mao, Yongyi (University of Ottawa)
One of the most important crowdsourcing topics is to study the effective quality control methods so as to reduce the cost and to guarantee the quality of task processing. As an effective approach, iterative improvement workflow is known to choose the best result from multiple workflows. However, for complex crowdsourcing tasks that consists of a certain number of subtasks under some specific constraints, but cannot be split into subtasks to be crowdsourced, the approach merely considers the best workflow without integrating the contributions of all workflows, which potentially results in extra costs for more iterations. In this paper, we propose an assembly model to integrate the best output of subtasks from different workflows. Moreover, we devise an efficient iterative method based on POMDP to improve the quality of assembled output. Empirical studies confirms the superiority of our proposed model.
Optimal and Efficient Stochastic Motion Planning in Partially-Known Environments
Luna, Ryan J (Rice University) | Lahijanian, Morteza (Rice University) | Moll, Mark (Rice University) | Kavraki, Lydia E (Rice University)
A framework capable of computing optimal control policies for a continuous system in the presence of both action and environment uncertainty is presented in this work. The framework decomposes the planning problem into two stages: an offline phase that reasons only over action uncertainty and an online phase that quickly reacts to the uncertain environment. Offline, a bounded-parameter Markov decision process (BMDP) is employed to model the evolution of the stochastic system over a discretization of the environment. Online, an optimal control policy over the BMDP is computed. Upon the discovery of an unknown environment feature during policy execution, the BMDP is updated and the optimal control policy is efficiently recomputed. Depending on the desired quality of the control policy, a suite of methods is presented to incorporate new information into the BMDP with varying degrees of detail online. Experiments confirm that the framework recomputes high-quality policies in seconds and is orders of magnitude faster than existing methods.
Point-Based POMDP Solving with Factored Value Function Approximation
Veiga, Tiago (Universidade de Lisboa) | Spaan, Matthijs (Delft University of Technology) | Lima, Pedro (Universidade de Lisboa)
Partially observable Markov decision processes (POMDPs) provide a principled mathematical framework for modeling autonomous decision-making problems. A POMDP solution is often represented by a value function comprised of a set of vectors. In the case of factored models, the size of these vectors grows exponentially with the number of state factors, leading to scalability issues. We consider an approximate value function representation based on a linear combination of basis functions. In particular, we present a backup operator that can be used in any point-based POMDP solver. Furthermore, we show how under certain conditions independence between observation factors can be exploited for large computational gains. We experimentally verify our contributions and show that they have the potential to improve point-based methods in policy quality and solution size.
An Adversarial Interpretation of Information-Theoretic Bounded Rationality
Ortega, Pedro A. (University of Pennsylvania) | Lee, Daniel D. (University of Pennsylvania)
Recently, there has been a growing interest in modeling planning with information constraints. Accordingly, an agent maximizes a regularized expected utility known as the free energy, where the regularizer is given by the information divergence from a prior to a posterior policy. While this approach can be justified in various ways, including from statistical mechanics and information theory, it is still unclear how it relates to decision-making against adversarial environments. This connection has previously been suggested in work relating the free energy to risk-sensitive control and to extensive form games. Here, we show that a single-agent free energy optimization is equivalent to a game between the agent and an imaginary adversary. The adversary can, by paying an exponential penalty, generate costs that diminish the decision maker's payoffs. It turns out that the optimal strategy of the adversary consists in choosing costs so as to render the decision maker indifferent among its choices, which is a definining property of a Nash equilibrium, thus tightening the connection between free energy optimization and game theory.