 Directed Networks


Open-Universe Weighted Model Counting: Extended Abstract

AAAI Conferences

Weighted model counting (WMC) has recently emerged as an effective and general approach to probabilistic inference, offering a computational framework for encoding a variety of formalisms, such as factor graphs and Bayesian networks. The advent of large-scale probabilistic knowledge bases has generated further interest in relational probabilistic representations, obtained by according weights to first-order formulas, whose semantics is given in terms of the ground theory, and solved by WMC. A fundamental limitation is that the domain of quantification, by construction and design, is assumed to be finite, which is at odds with areas such as vision and language understanding, where the existence of objects must be inferred from raw data. Dropping the finite-domain assumption has been known to improve the expressiveness of a first-order language for open-universe purposes, but these languages, so far, have eluded WMC approaches. In this paper, we revisit relational probabilistic models over an infinite domain, and establish a number of results that permit effective algorithms. We demonstrate this language on a number of examples, including a parameterized version of Pearl's Burglary-Earthquake-Alarm Bayesian network.
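For readers new to WMC, the following is a minimal brute-force sketch of the finite, propositional case only; it does not attempt the open-universe setting the paper addresses. The function name `weighted_model_count`, the deterministic alarm rule, and all probabilities are hypothetical choices made for illustration.

```python
from itertools import product


def weighted_model_count(variables, formula, weights):
    """Sum, over all satisfying assignments, the product of literal weights."""
    total = 0.0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            weight = 1.0
            for var, val in assignment.items():
                weight *= weights[var][val]
            total += weight
    return total


# Hypothetical toy model: burglary and earthquake are independent causes,
# and the alarm rings exactly when at least one of them occurs.
variables = ["burglary", "earthquake", "alarm"]
weights = {
    "burglary": {True: 0.1, False: 0.9},    # assumed prior
    "earthquake": {True: 0.2, False: 0.8},  # assumed prior
    "alarm": {True: 1.0, False: 1.0},       # determined by the theory below
}


def theory(a):
    return a["alarm"] == (a["burglary"] or a["earthquake"])


# Conditional probabilities are ratios of weighted model counts.
z = weighted_model_count(variables, theory, weights)
z_alarm = weighted_model_count(
    variables, lambda a: theory(a) and a["alarm"], weights)
z_alarm_burglary = weighted_model_count(
    variables, lambda a: theory(a) and a["alarm"] and a["burglary"], weights)

p_alarm = z_alarm / z
p_burglary_given_alarm = z_alarm_burglary / z_alarm
```

Enumeration is exponential in the number of variables, so practical WMC solvers rely on knowledge compilation or search instead; the sketch only shows what quantity is being computed.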


Partial Observability in Grammar Based Plan Recognition

AAAI Conferences

Prior work on viewing plan recognition as parsing of grammars has assumed completely observable actions. This paper provides an algorithm to rewrite plan grammars to allow for recognizing partially observable actions. For the ELEXIR (Geib 2009) system, the impact of this rewriting on plan recognition runtime is shown to be limited to those plans that actually use the partially observable actions.


Goal Recognition with Noisy Observations

AAAI Conferences

It may be that one agent needs to monitor the activities of another agent, attempt to assist the other agent, or simply avoid getting in the way while performing its own duties. For all of these cases the agent needs to be able to realize what the other agent is doing. In the absence of full and timely communication of plans and goals, goal and plan recognition becomes essential. Many goal recognition techniques allow the sequence of observations to be incomplete, but few consider the possibility of noisy observations. In practice, this is not ...

... (2010) to estimate the probability of each possible goal based on the difference between the cost of the best plan for the goal given the observed actions, Cost(G|O), and the cost of the best plan for the goal without the observed actions, Cost(G|¬O). The big difference here is that the observations only indirectly give us probabilities for actions in the plan graph. We therefore first construct a Bayesian Network (BN) to estimate these action probabilities, and then use this probability information in the plan graph to compute expected cost for each goal, given the observations.
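The cost-difference idea mentioned in this excerpt can be sketched as follows. The excerpt says only that goal probabilities are based on the difference between the two plan costs; the logistic (Boltzmann-style) likelihood, the `beta` parameter, the function name `goal_posterior`, and all costs below are assumptions made for illustration, and the sketch omits the Bayesian network for noisy observations entirely.

```python
import math


def goal_posterior(cost_with_obs, cost_without_obs, priors, beta=1.0):
    """Posterior over goals from plan-cost differences (assumed logistic form)."""
    scores = {}
    for goal, prior in priors.items():
        # Positive delta: complying with the observations makes the plan costlier.
        delta = cost_with_obs[goal] - cost_without_obs[goal]
        likelihood = 1.0 / (1.0 + math.exp(beta * delta))
        scores[goal] = likelihood * prior
    total = sum(scores.values())
    return {goal: score / total for goal, score in scores.items()}


# Hypothetical costs for two candidate goals.
cost_with_obs = {"kitchen": 10.0, "garage": 14.0}     # best plan consistent with observations
cost_without_obs = {"kitchen": 12.0, "garage": 10.0}  # best plan avoiding the observations
priors = {"kitchen": 0.5, "garage": 0.5}

posterior = goal_posterior(cost_with_obs, cost_without_obs, priors)
```

Here the observations lie on a cheap path to the kitchen but force a detour for the garage, so the kitchen goal receives most of the posterior mass.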


Scalable Score Computation for Learning Multinomial Bayesian Networks over Distributed Data

AAAI Conferences

In this paper, we focus on the problem of learning a Bayesian network over distributed data stored in a commodity cluster. Specifically, we address the challenge of computing the scoring function over distributed data in a scalable manner, which is a fundamental task during learning. We propose a novel approach designed to achieve: (a) scalable score computation using the principle of gossiping; (b) lower resource consumption via a probabilistic approach for maintaining scores using the properties of a Markov chain; and (c) effective distribution of tasks during score computation (on large datasets) by synergistically combining well-known hashing techniques. Through theoretical analysis, we show that our approach is superior to a MapReduce-style computation in terms of communication bandwidth. Further, it is superior to the batch-style processing of MapReduce for recomputing scores when new data are available.
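As a rough illustration of the gossiping principle named in (a), the sketch below runs randomized pairwise averaging over per-node counts. It is not the paper's algorithm: the function name `gossip_average`, the node counts, and the number of rounds are all hypothetical.

```python
import random


def gossip_average(values, rounds, rng):
    """Randomized pairwise gossip: two random nodes repeatedly average their values."""
    values = list(values)
    n = len(values)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)
        mean = (values[i] + values[j]) / 2.0
        values[i] = mean
        values[j] = mean
    return values


# Hypothetical local counts of one (variable, parent configuration) event per node.
local_counts = [12.0, 7.0, 30.0, 5.0, 18.0, 9.0, 22.0, 17.0]
rng = random.Random(0)
estimates = gossip_average(local_counts, rounds=500, rng=rng)

# Each node can recover the global count from its own estimate of the mean.
global_count_at_node_0 = estimates[0] * len(local_counts)
```

Each exchange preserves the sum of all values, so every node's estimate converges to the global mean without any node collecting the full data set.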


The Intersection Between the Top Data Mining Algorithms and AI - DZone Big Data

#artificialintelligence

In 2007, a team of professors from the IEEE Conference on Data Mining posted a survey paper on the top 10 data mining algorithms. Some of these algorithms are playing a very important role in the future of artificial intelligence. According to this GetResponse blog, artificial intelligence is playing an influential role in marketing. "The technology is of course already there: artificial intelligence is no longer a sci-fi movie thing, but allows you to even automate creativity. Custom audiences and re-targeting options are now a must in advertising."


Probabilistic Sensor Fusion for Ambient Assisted Living

arXiv.org Machine Learning

There is a widely-accepted need to revise current forms of healthcare provision, with particular interest in sensing systems in the home. Given a multiple-modality sensor platform with heterogeneous network connectivity, as is under development in the Sensor Platform for HEalthcare in Residential Environment (SPHERE) Interdisciplinary Research Collaboration (IRC), we face specific challenges relating to the fusion of the heterogeneous sensor modalities. We introduce Bayesian models for sensor fusion, which aim to address the challenges of fusion of heterogeneous sensor modalities. Using this approach we are able to identify the modalities that have most utility for each particular activity, and simultaneously identify which features within that activity are most relevant for a given activity. We further show how the two separate tasks of location prediction and activity recognition can be fused into a single model, which allows for simultaneous learning and prediction for both tasks. We analyse the performance of this model on data collected in the SPHERE house, and show its utility. We also compare against some benchmark models which do not have the full structure, and show how the proposed model compares favourably to these methods.


Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning

arXiv.org Machine Learning

A common problem in disciplines of applied statistics research such as astrostatistics is that of estimating the posterior distribution of relevant parameters. Typically, the likelihoods for such models are computed via expensive experiments such as cosmological simulations of the universe. An urgent challenge in these research domains is to develop methods that can estimate the posterior with few likelihood evaluations. In this paper, we study active posterior estimation in a Bayesian setting when the likelihood is expensive to evaluate. Existing techniques for posterior estimation are based on generating samples representative of the posterior. Such methods do not consider efficiency in terms of likelihood evaluations. In order to be query efficient we treat posterior estimation in an active regression framework. We propose two myopic query strategies to choose where to evaluate the likelihood and implement them using Gaussian processes. Via experiments on a series of synthetic and real examples we demonstrate that our approach is significantly more query efficient than existing techniques and other heuristics for posterior estimation.


Bayesian models in R (Code examples)

#artificialintelligence

In statistics, making decisions always involves some amount of uncertainty. This could be due to unknown parameters or quantities. For example, if a company is releasing a product in the market, the population that will be actively seeking the product and the market share the product will capture compared to other products are both uncertain. Bayesian analysis can be applied when the statistical model involves this kind of uncertainty. Bayesian analysis can also be applied as a flexible extension of maximum likelihood.


The Algorithms Behind Probabilistic Programming

#artificialintelligence

Moreover, these algorithms are robust, so they don't require problem-specific hand-tuning. One powerful example is sampling from an arbitrary probability distribution, which we need to do often (and efficiently!) when doing inference. The brute force approach, rejection sampling, is problematic because acceptance rates are low: as only a tiny fraction of attempts generate successful samples, the algorithms are slow and inefficient. See this post by Jeremy Kun for further details. Until recently, the main alternative to this naive approach was Markov Chain Monte Carlo sampling (of which Metropolis-Hastings and Gibbs sampling are well-known examples). If you used Bayesian inference in the 90s or early 2000s, you may remember BUGS (and WinBUGS) or JAGS, which used these methods. These remain popular teaching tools (see e.g.
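To make the low-acceptance problem concrete, here is a minimal sketch of rejection sampling with a uniform proposal. The narrow Gaussian target, the proposal interval, and the names `rejection_sample` and `narrow_gaussian` are assumptions chosen for illustration, not taken from the post.

```python
import math
import random


def narrow_gaussian(x, mu=0.0, sigma=0.05):
    """Target density: a sharply peaked normal distribution (assumed for the demo)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def rejection_sample(target_pdf, lo, hi, bound, n_attempts, rng):
    """Propose x uniformly on [lo, hi]; accept when a uniform height falls under the pdf."""
    accepted = []
    for _ in range(n_attempts):
        x = rng.uniform(lo, hi)
        u = rng.uniform(0.0, bound)  # bound must be >= max of target_pdf on [lo, hi]
        if u < target_pdf(x):
            accepted.append(x)
    return accepted, len(accepted) / n_attempts


rng = random.Random(0)
peak = narrow_gaussian(0.0)  # the density is maximal at its mean
samples, acceptance_rate = rejection_sample(
    narrow_gaussian, -5.0, 5.0, peak, n_attempts=20000, rng=rng)
print(f"accepted {len(samples)} of 20000 attempts ({acceptance_rate:.2%})")
```

The expected acceptance rate is the area under the density divided by the area of the bounding box, roughly 1 / (10 * peak) here, so only about one proposal in eighty is kept. The waste grows quickly with dimension, which is the motivation for MCMC methods.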


Exploration and Exploitation of Victorian Science in Darwin's Reading Notebooks

arXiv.org Artificial Intelligence

Search in an environment with an uncertain distribution of resources involves a trade-off between exploitation of past discoveries and further exploration. This extends to information foraging, where a knowledge-seeker shifts between reading in depth and studying new domains. To study this decision-making process, we examine the reading choices made by one of the most celebrated scientists of the modern era: Charles Darwin. From the full text of books listed in his chronologically-organized reading journals, we generate topic models to quantify his local (text-to-text) and global (text-to-past) reading decisions using Kullback-Leibler Divergence, a cognitively-validated, information-theoretic measure of relative surprise. Rather than a pattern of surprise-minimization, corresponding to a pure exploitation strategy, Darwin's behavior shifts from early exploitation to later exploration, seeking unusually high levels of cognitive surprise relative to previous eras. These shifts, detected by an unsupervised Bayesian model, correlate with major intellectual epochs of his career as identified both by qualitative scholarship and Darwin's own self-commentary. Our methods allow us to compare his consumption of texts with their publication order. We find Darwin's consumption more exploratory than the culture's production, suggesting that underneath gradual societal changes are the explorations of individual synthesis and discovery. Our quantitative methods advance the study of cognitive search through a framework for testing interactions between individual and collective behavior and between short- and long-term consumption choices. This novel application of topic modeling to characterize individual reading complements widespread studies of collective scientific behavior.
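The surprise measure named in this abstract can be illustrated with a short sketch of Kullback-Leibler divergence between discrete topic distributions. The topic mixtures are hypothetical, and the direction of the divergence (new text relative to the previous one) is an assumption; the abstract does not state which argument order the authors use.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits for discrete distributions over the same topics.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical topic mixtures over three topics.
previous_text = [0.7, 0.2, 0.1]
similar_text = [0.6, 0.3, 0.1]   # stays close to the previous reading
novel_text = [0.1, 0.2, 0.7]     # shifts to a different topic

surprise_similar = kl_divergence(similar_text, previous_text)
surprise_novel = kl_divergence(novel_text, previous_text)
```

Under this convention, a low value corresponds to an exploitation-like choice (more of the same topics) and a high value to an exploration-like one.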