Learning Graphical Models
Complementing the Execution of AI Systems with Human Computation
Kamar, Ece (Microsoft Research) | Manikonda, Lydia (Arizona State University)
For a multitude of tasks that come naturally to humans, performance of AI systems is inferior to human level performance. We show how human intellect made available via crowdsourcing can be used to complement an existing system during execution. We introduce a hybrid workflow that queries people to verify and correct the output of the system and present a simulation-based workflow optimization method to balance the cost of human input with the expected improvement in performance. Through empirical evaluations on an image captioning system, we show that the hybrid system, which combines the AI system with human input, significantly outperforms the automated system by properly trading off the cost of human input with expected benefit. Finally, we show that human input collected at execution time can be used to teach the system about its errors and limitations.
Energy Disaggregation Methods for Commercial Buildings Using Smart Meter and Operational Data
Bansal, Shubham (École Polytechnique Fédérale de Lausanne) | Schmidt, Mischa (NEC Laboratories Europe)
A key piece of information for improving the energy efficiency of buildings is the appliance-level breakdown of energy consumption. Energy disaggregation is the process of obtaining this breakdown from building-level aggregate data using computational techniques. Most current research focuses on residential buildings, obtaining this information from a single smart meter and often relying on high-frequency data. This work is directed at commercial buildings equipped with building management and automation systems providing low-frequency operational and contextual data. This paper presents a machine learning method to disaggregate the energy consumption of the building using this operational data as input features. Experimental results on two publicly available datasets demonstrate the effectiveness of the approach, which surpasses existing methods. For all but one appliance of House 2 of the publicly available REDD dataset, improvements in normalized error in assigned power range between 20% (Lighting) and 220% (Stove). For another dataset from an educational facility in Singapore, disaggregation accuracy of 92% is reported for the facility's cooling system.
The Intersection Between the Top Data Mining Algorithms and AI - DZone Big Data
In 2007, a team of professors from the IEEE Conference on Data Mining posted a survey paper on the top 10 data mining algorithms. Some of these algorithms are playing a very important role in the future of artificial intelligence. According to this GetResponse blog, artificial intelligence is playing an influential role in marketing. "The technology is of course already there: artificial intelligence is no longer a sci-fi movie thing, but allows you to even automate creativity. Custom audiences and re-targeting options are now a must in advertising."
Probabilistic Sensor Fusion for Ambient Assisted Living
Diethe, Tom, Twomey, Niall, Kull, Meelis, Flach, Peter, Craddock, Ian
There is a widely accepted need to revise current forms of healthcare provision, with particular interest in sensing systems in the home. Given a multiple-modality sensor platform with heterogeneous network connectivity, as is under development in the Sensor Platform for HEalthcare in Residential Environment (SPHERE) Interdisciplinary Research Collaboration (IRC), we face specific challenges relating to the fusion of the heterogeneous sensor modalities. We introduce Bayesian models for sensor fusion, which aim to address these challenges. Using this approach we are able to identify the modalities that have most utility for each particular activity, and simultaneously identify which features within that modality are most relevant for a given activity. We further show how the two separate tasks of location prediction and activity recognition can be fused into a single model, which allows for simultaneous learning and prediction for both tasks. We analyse the performance of this model on data collected in the SPHERE house, and show its utility. We also compare against some benchmark models which do not have the full structure, and show how the proposed model compares favourably to these methods.
Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning
Kandasamy, Kirthevasan, Schneider, Jeff, Póczos, Barnabás
A common problem in disciplines of applied statistics research such as astrostatistics is that of estimating the posterior distribution of relevant parameters. Typically, the likelihoods for such models are computed via expensive experiments such as cosmological simulations of the universe. An urgent challenge in these research domains is to develop methods that can estimate the posterior with few likelihood evaluations. In this paper, we study active posterior estimation in a Bayesian setting when the likelihood is expensive to evaluate. Existing techniques for posterior estimation are based on generating samples representative of the posterior. Such methods do not consider efficiency in terms of likelihood evaluations. In order to be query efficient, we treat posterior estimation in an active regression framework. We propose two myopic query strategies to choose where to evaluate the likelihood and implement them using Gaussian processes. Via experiments on a series of synthetic and real examples, we demonstrate that our approach is significantly more query efficient than existing techniques and other heuristics for posterior estimation.
Energy Prediction using Spatiotemporal Pattern Networks
Jiang, Zhanhong, Liu, Chao, Akintayo, Adedotun, Henze, Gregor, Sarkar, Soumik
This paper presents a novel data-driven technique based on the spatiotemporal pattern network (STPN) for energy/power prediction for complex dynamical systems. Built on symbolic dynamic filtering, the STPN framework is used to capture not only the individual system characteristics but also the pair-wise causal dependencies among different sub-systems. For quantifying the causal dependency, a mutual information based metric is presented. An energy prediction approach is subsequently proposed based on the STPN framework. For validating the proposed scheme, two case studies are presented, one involving wind turbine power prediction (supply side energy) using the Western Wind Integration data set generated by the National Renewable Energy Laboratory (NREL) for identifying the spatiotemporal characteristics, and the other, residential electric energy disaggregation (demand side energy) using the Building America 2010 data set from NREL for exploring the temporal features. In the energy disaggregation context, convex programming techniques beyond the STPN framework are developed and applied to achieve improved disaggregation performance.
Edge-exchangeable graphs and sparsity (NIPS 2016)
Cai, Diana, Campbell, Trevor, Broderick, Tamara
Many popular network models rely on the assumption of (vertex) exchangeability, in which the distribution of the graph is invariant to relabelings of the vertices. However, the Aldous-Hoover theorem guarantees that these graphs are dense or empty with probability one, whereas many real-world graphs are sparse. We present an alternative notion of exchangeability for random graphs, which we call edge exchangeability, in which the distribution of a graph sequence is invariant to the order of the edges. We demonstrate that edge-exchangeable models, unlike models that are traditionally vertex exchangeable, can exhibit sparsity. To do so, we outline a general framework for graph generative models; by contrast to the pioneering work of Caron and Fox (2015), models within our framework are stationary across steps of the graph sequence. In particular, our model grows the graph by instantiating more latent atoms of a single random measure as the dataset size increases, rather than adding new atoms to the measure.
Bayesian models in R (Code examples)
In statistics, making decisions always involves some amount of uncertainty. This could be due to unknown parameters or quantities. For example, if a company is releasing a product in the market, the population that will be actively seeking the product and the share of the market the product will capture compared to other products are uncertainties. Bayesian analysis applies when the uncertainty in a statistical model is expressed in terms of probability. Bayesian analysis can also be viewed as a flexible extension of maximum likelihood.
The Algorithms Behind Probabilistic Programming
Moreover, these algorithms are robust, so they don't require problem-specific hand-tuning. One powerful example is sampling from an arbitrary probability distribution, which we need to do often (and efficiently!) when doing inference. The brute force approach, rejection sampling, is problematic because acceptance rates are low: as only a tiny fraction of attempts generate successful samples, the algorithms are slow and inefficient. See this post by Jeremy Kun for further details. Until recently, the main alternative to this naive approach was Markov Chain Monte Carlo sampling (of which Metropolis-Hastings and Gibbs sampling are well-known examples). If you used Bayesian inference in the 90s or early 2000s, you may remember BUGS (and WinBUGS) or JAGS, which used these methods. These remain popular teaching tools (see e.g.
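To make the point about acceptance rates concrete, here is a minimal sketch of rejection sampling in plain Python. It is an illustration under stated assumptions, not code from the post: the target (an unnormalized Beta(2, 5) density), the uniform proposal on [0, 1], the envelope height 0.082, and the function names are all chosen here for the example.

```python
import random


def rejection_sample(target, bound, n_samples, rng):
    """Draw n_samples from an unnormalized density on [0, 1].

    Assumes target(x) <= bound for every x in [0, 1].
    Returns (samples, acceptance_rate).
    """
    samples = []
    attempts = 0
    while len(samples) < n_samples:
        attempts += 1
        x = rng.random()          # propose a point uniformly on [0, 1)
        u = rng.random() * bound  # uniform height under the flat envelope
        if u < target(x):         # keep the point only if it lies under the curve
            samples.append(x)
    return samples, n_samples / attempts


def beta_2_5_unnormalized(x):
    # Hypothetical target: proportional to a Beta(2, 5) density.
    # Its maximum on [0, 1] is 0.08192 at x = 0.2, so 0.082 is a valid bound.
    return x * (1.0 - x) ** 4


rng = random.Random(0)
samples, rate = rejection_sample(beta_2_5_unnormalized, 0.082, 5000, rng)
mean = sum(samples) / len(samples)
print(f"sample mean: {mean:.3f}, acceptance rate: {rate:.3f}")
```

The acceptance rate equals the area under the target divided by the area under the envelope. For this smooth one-dimensional target that ratio is moderate, but it typically shrinks quickly as the target becomes more peaked or the dimension grows, which is the inefficiency described above.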
Exploration and Exploitation of Victorian Science in Darwin's Reading Notebooks
Murdock, Jaimie, Allen, Colin, DeDeo, Simon
Search in an environment with an uncertain distribution of resources involves a trade-off between exploitation of past discoveries and further exploration. This extends to information foraging, where a knowledge-seeker shifts between reading in depth and studying new domains. To study this decision-making process, we examine the reading choices made by one of the most celebrated scientists of the modern era: Charles Darwin. From the full text of books listed in his chronologically organized reading journals, we generate topic models to quantify his local (text-to-text) and global (text-to-past) reading decisions using Kullback-Leibler divergence, a cognitively validated, information-theoretic measure of relative surprise. Rather than a pattern of surprise-minimization, corresponding to a pure exploitation strategy, Darwin's behavior shifts from early exploitation to later exploration, seeking unusually high levels of cognitive surprise relative to previous eras. These shifts, detected by an unsupervised Bayesian model, correlate with major intellectual epochs of his career as identified both by qualitative scholarship and Darwin's own self-commentary. Our methods allow us to compare his consumption of texts with their publication order. We find Darwin's consumption more exploratory than the culture's production, suggesting that underneath gradual societal changes are the explorations of individual synthesis and discovery. Our quantitative methods advance the study of cognitive search through a framework for testing interactions between individual and collective behavior and between short- and long-term consumption choices. This novel application of topic modeling to characterize individual reading complements widespread studies of collective scientific behavior.
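As a small illustration of the surprise measure named in the abstract, the sketch below computes the Kullback-Leibler divergence between two discrete topic distributions. The two-topic distributions, the function name, and the choice of which text plays the role of p or q are assumptions made for the example; the abstract does not specify the paper's exact formulation.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in bits for discrete distributions over the same topics."""
    if len(p) != len(q):
        raise ValueError("p and q must cover the same topics")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue          # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf   # p puts mass where q has none
        total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical topic mixtures for two texts (illustrative numbers only).
text_a = [0.5, 0.5]
text_b = [0.25, 0.75]

print(kl_divergence(text_a, text_b))  # about 0.2075 bits
print(kl_divergence(text_b, text_a))  # about 0.1887 bits
```

The two directions give different values because the divergence is asymmetric, so an analysis of this kind has to fix which distribution stands for the reader's expectation and which for the text actually encountered.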