Goto

Collaborating Authors

 Bayesian Learning


Bayesian Nonparametric Crowdsourcing

arXiv.org Machine Learning

Crowdsourcing has been proven to be an effective and efficient tool to annotate large datasets. User annotations are often noisy, so methods to combine the annotations to produce reliable estimates of the ground truth are necessary. We claim that considering the existence of clusters of users in this combination step can improve the performance. This is especially important in early stages of crowdsourcing implementations, where the number of annotations is low. At this stage there is not enough information to accurately estimate the bias introduced by each annotator separately, so we have to resort to models that consider the statistical links among them. In addition, finding these clusters is interesting in itself as knowing the behavior of the pool of annotators allows implementing efficient active learning strategies. Based on this, we propose in this paper two new fully unsupervised models based on a Chinese Restaurant Process (CRP) prior and a hierarchical structure that allows inferring these groups jointly with the ground truth and the properties of the users. Efficient inference algorithms based on Gibbs sampling with auxiliary variables are proposed. Finally, we perform experiments, both on synthetic and real databases, to show the advantages of our models over state-of-the-art algorithms.


Church: a language for generative models

arXiv.org Artificial Intelligence

We introduce Church, a universal language for describing stochastic generative processes. Church is based on the Lisp model of lambda calculus, containing a pure Lisp as its deterministic subset. The semantics of Church is defined in terms of evaluation histories and conditional distributions on such histories. Church also includes a novel language construct, the stochastic memoizer, which enables simple description of many complex non-parametric models. We illustrate language features through several examples, including: a generalized Bayes net in which parameters cluster over trials, infinite PCFGs, planning by inference, and various non-parametric clustering models. Finally, we show how to implement query on any Church program, exactly and approximately, using Monte Carlo techniques.


Labeling Complicated Objects: Multi-View Multi-Instance Multi-Label Learning

AAAI Conferences

Multi-Instance Multi-Label (MIML) is a learning framework where an example is associated with multiple labels and represented by a set of feature vectors (multiple instances). In the formalization of MIML learning, instances come from a single source (single view). To leverage multiple information sources (multi-view), we develop a multi-view MIML framework based on hierarchical Bayesian Network, and derive an effective learning algorithm based on variational inference. The model can naturally deal with examples in which some views could be absent (partial examples). On multi-view datasets, it is shown that our method is better than other multi-view and single-view approaches particularly in the presence of partial examples. On single-view benchmarks, extensive evaluation shows that our method is highly competitive or better than other MIML approaches on labeling examples and instances. Moreover, our method can effectively handle datasets with a large number of labels.


PREGO: An Action Language for Belief-Based Cognitive Robotics in Continuous Domains

AAAI Conferences

The area of cognitive robotics is often subject to the criticism that the proposals investigated in the literature are too far removed from the kind of continuous uncertainty and noise seen in actual real-world robotics. This paper proposes a new language and an implemented system, called PREGO, based on the situation calculus, that is able to reason effectively about degrees of belief against noisy sensors and effectors in continuous domains. It embodies the representational richness of conventional logic-based action languages, such as context-sensitive successor state axioms, but is still shown to be efficient using a number of empirical evaluations. We believe that PREGO is a powerful framework for exploring real-time reactivity and an interesting bridge between logic and probability for cognitive robotics applications.


Modeling and Predicting Popularity Dynamics via Reinforced Poisson Processes

AAAI Conferences

Indeed, to the best of our knowledge, we lack forgotten over time (Wu and Humberman 2007). For example, a probabilistic framework to model and predict the popularity videos on YouTube or stories on Digg gain their popularity dynamics of individual items. The reason behind this is by striving for views or votes (Szabo and Huberman partly illustrated in Figure 1, suggesting that the dynamical 2010); papers increase their visibility by competing for citations processes governing individual items appear too noisy to be from new papers (Ren et al. 2010; Wang, Song, and amenable to quantification. Barabรกsi 2013); tweets or Hashtags in Twitter become more In this paper, we model the stochastic popularity dynamics popular as being retweeted (Hong, Dan, and Davison 2011) using reinforced Poisson processes, capturing simultaneously and so do webpages as being attached by incoming hyperlinks three key ingredients: fitness of an item, characterizing (Ratkiewicz et al. 2010). An ability to predict the popularity its inherent competitiveness against other items; a general of individual items within a dynamically evolving system temporal relaxation function, corresponding to the aging not only probes our understanding of complex systems, in the ability to attract new attentions; and a reinforcement but also has important implications in a wide range of domains, mechanism, documenting the well-known "rich-get-richer" from marketing and traffic control to policy making phenomenon. The benefit of the proposed model is threefold: and risk management. Despite recent advances of empirical (1) It models the arrival process of individual attentions methods, we lack a general modeling framework to predict directly in contrast to relying on aggregated popularity the popularity of individual items within a complex evolving time series; (2) As a generative probabilistic model, it can be system.


Content-Structural Relation Inference in Knowledge Base

AAAI Conferences

Relation inference between concepts in knowledge base has been extensively studied in recent years. Previous methods mostly apply the relations in the knowledge base, without fully utilizing the contents, i.e., the attributes of concepts in knowledge base. In this paper, we propose a content-structural relation inference method (CSRI) which integrates the content and structural information between concepts for relation inference. Experiments on data sets show that CSRI obtains 15% improvement compared with the state-of-the-art methods.


Predicting the Hardness of Learning Bayesian Networks

AAAI Conferences

There are various algorithms for finding a Bayesian networkstructure (BNS) that is optimal with respect to a given scoring function. No single algorithm dominates the others in speed, and, given a problem instance, it is a priori unclear which algorithm will perform best and how fast it will solve the problem. Estimating the runtimes directly is extremely difficult as they are complicated functions of the instance. The main contribution of this paper is characterization of the empirical hardness of an instance for a given algorithm based on a novel collection of non-trivial, yet efficiently computable features. Our empirical results, based on the largest evaluation of state-of-the-art BNS learning algorithms to date, demonstrate that we can predict the runtimes to a reasonable degree of accuracy, and effectively select algorithms that perform well on a particular instance. Moreover, we also show how the results can be utilized in building a portfolio algorithm that combines several individual algorithms in an almost optimal manner.


Tightening Bounds for Bayesian Network Structure Learning

AAAI Conferences

A recent breadth-first branch and bound algorithm (BFBnB)for learning Bayesian network structures (Maloneet al. 2011) uses two bounds to prune the searchspace for better efficiency; one is a lower bound calculatedfrom pattern database heuristics, and the otheris an upper bound obtained by a hill climbing search.Whenever the lower bound of a search path exceeds theupper bound, the path is guaranteed to lead to suboptimalsolutions and is discarded immediately. This paperintroduces methods for tightening the bounds. Thelower bound is tightened by using more informed variablegroupings when creating the pattern databases, andthe upper bound is tightened using an anytime learningalgorithm. Empirical results show that these boundsimprove the efficiency of Bayesian network learning bytwo to three orders of magnitude.


Finding the k-best Equivalence Classes of Bayesian Network Structures for Model Averaging

AAAI Conferences

In this paper we develop an algorithm to find the k-best equivalence classes of Bayesian networks. Our algorithm is capable of finding much more best DAGs than the previous algorithm that directly finds the k-best DAGs (Tian, He and Ram 2010). We demonstrate our algorithm in the task of Bayesian model averaging. Empirical results show that our algorithm significantly outperforms the k-best DAG algorithm in both time and space to achieve the same quality of approximation. Our algorithm goes beyond the maximum-a-posteriori (MAP) model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.


Recovering from Selection Bias in Causal and Statistical Inference

AAAI Conferences

Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies. In this paper, we provide complete graphical and algorithmic conditions for recovering conditional probabilities from selection biased data. We also provide graphical conditions for recoverability when unbiased data is available over a subset of the variables. Finally, we provide a graphical condition that generalizes the backdoor criterion and serves to recover causal effects when the data is collected under preferential selection.