Bayesian Learning
Exponential Family Hybrid Semi-Supervised Learning
Agarwal, Arvind (University of Utah) | III, Hal Daume (University of Utah)
We present an approach to semi-supervised learning based on an exponential family characterization. Our approach generalizes previous work on coupled priors for hybrid generative/discriminative models. Our model is more flexible and natural than previous approaches. Experimental results on several data sets show that our approach also performs better in practice.ย
A New Bayesian Approach to Multiple Intermittent Fault Diagnosis
Abreu, Rui (Delft University of Technology) | Zoeteweij, Peter (Delft University of Technology) | Gemund, Arjan J.C. van (Delft University of Technology)
Logic reasoning approaches to fault diagnosis account for the fact that a component c j may fail intermittently by introducing a parameter g j that expresses the probability the component exhibits correct behavior. This component parameter g j , in conjunction with a priori fault probability, is usedin a Bayesian framework to compute the posterior fault candidate probabilities. Usually, information on g j is not known a priori. While proper estimation of g j can have a great impact on the diagnostic accuracy, at present, only approximations have been proposed. We present a novel framework, BARINEL, that computes exact estimations of g j as integral part of the posterior candidate probability computation. BARINELโs diagnostic performance is evaluated for both synthetic and real software systems. Our results show that our approach is superior to approaches based on classical persistent fault models as well as previously proposed intermittent fault models.
Investigations of Continual Computation
Shahaf, Dafna (Carnegie Mellon) | Horvitz, Eric (Microsoft Research)
Autonomous agents that sense, reason, and act in real-world environments for extended periods often need to solve streams of incoming problems. Traditionally, effort is applied only to problems that have already arrived and have been noted. We examine continual computation methods that allow agents to ideally allocate time to solving current as well as potential future problems under uncertainty. We first review prior work on continual computation. Then, we present new directions and results, including the consideration of shared subtasks and multiple tasks. We present results on the computational complexity of the continual-computation problem and provide approximations for arbitrary models of computational performance. Finally, we review special formulations for addressing uncertainty about the best algorithm to apply, learning about performance, and considering costs associated with delayed use of results.
Preference Functions That Score Rankings and Maximum Likelihood Estimation
Conitzer, Vincent (Duke University) | Rognlie, Matthew (Duke University) | Xia, Lirong (Duke University)
In social choice, a preference function (PF) takes a set of votes (linear orders over a set of alternatives) as input, and produces one or more rankings (also linear orders over the alternatives) as output. Such functions have many applications, for example, aggregating the preferences of multiple agents, or merging rankings (of, say, webpages) into a single ranking. The key issue is choosing a PF to use. One natural and previously studied approach is to assume that there is an unobserved "correct" ranking, and the votes are noisy estimates of this. Then, we can use the PF that always chooses the maximum likelihood estimate (MLE) of the correct ranking. In this paper, we define simple ranking scoring functions (SRSFs) and show that the class of neutral SRSFs is exactly the class of neutral PFs that are MLEs for some noise model. We also define composite ranking scoring functions (CRSFs) and show a condition under which these coincide with SRSFs. We study key properties such as consistency and continuity, and consider some example PFs. In particular, we study Single Transferable Vote (STV), a commonly used PF, showing that it is a CRSF but not an SRSF, thereby clarifying the extent to which it is an MLE function. This also gives a new perspective on how ties should be broken under STV. We leave some open questions.
Activity Recognition: Linking Low-Level Sensors to High-Level Intelligence
Yang, Qiang (Hong Kong Hong Kong University of Science and Technology)
Sensors provide computer systems with a window to the outside world. Activity recognition "sees" what is in the window to predict the locations, trajectories, actions, goals and plans of humans and objects. Building an activity recognition system requires a full range of interaction from statistical inference on lower level sensor data to symbolic AI at higher levels, where prediction results and acquired knowledge are passed up each level to form a knowledge food chain. In this article, I will give an overview of some of the current activity recognition research works and explore a life-cycle of learning and inference that allows the lowest-level radio-frequency signals to be transformed into symbolic logical representations for AI planning, which in turn controls the robots or guides human users through a sensor network, thus completing a full life-cycle of knowledge.
Machine Learning in Ecosystem Informatics and Sustainability
Dietterich, Thomas G. (Oregon State University)
Ecosystem Informatics brings together mathematical and computational tools to address scientific and policy challenges in the ecosystem sciences. These challenges include novel sensors for collecting data, algorithms for automated data cleaning, learning methods for building statistical models from data and for fitting mechanistic models to data, and algorithms for designing optimal policies for biosphere management. This presentation discusses these challenges and then describes recent work on the first two of these--new methods for automated arthropod population counting and linear Gaussian DBNs for automated cleaning of sensor network data.
Intelligent Tutoring Systems: New Challenges and Directions
Conati, Christina (University of British Columbia)
Can we devise educational systems that provide individualized instruction tailored to the needs of the individual learners, as many good teachers do? Intelligent Tutoring Systems is the interdisciplinary field that investigates this question by integrating research in Artificial Intelligence, Cognitive Science and Education. Research in this field has successfully delivered techniques and systems that provide adaptive support for student problem solving in variety of domains. There are, however, other educational activities that can benefit from individualized computer-based support, such as studying examples, exploring interactive simulations and playing educational games. Providing individualized support for these activities rises unique challenges, because it requires that an ITS can model and adapt to student behaviors, skills and mental states often not as structured and well-defined as those involved in traditional problem solving. I will present a variety of projects that illustrate some of these challenges, our proposed solutions, and future opportunities.
Feature Reinforcement Learning: Part I: Unstructured MDPs
General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, is an art that involves significant effort by designers. The primary goal of this work is to automate the reduction process and thereby significantly expand the scope of many existing reinforcement learning algorithms and the agents that employ them. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in Part II [Hut09c]. The role of POMDPs is also considered there.
Conditional Probability Tree Estimation Analysis and Algorithms
Beygelzimer, Alina, Langford, John, Lifshits, Yuri, Sorkin, Gregory, Strehl, Alex
We consider the problem of estimating the conditional probability of a label in time $O(\log n)$, where $n$ is the number of possible labels. We analyze a natural reduction of this problem to a set of binary regression problems organized in a tree structure, proving a regret bound that scales with the depth of the tree. Motivated by this analysis, we propose the first online algorithm which provably constructs a logarithmic depth tree on the set of labels to solve this problem. We test the algorithm empirically, showing that it works succesfully on a dataset with roughly $10^6$ labels.
A Minimum Description Length Approach to Multitask Feature Selection
Many regression problems involve not one but several response variables (y's). Often the responses are suspected to share a common underlying structure, in which case it may be advantageous to share information across them; this is known as multitask learning. As a special case, we can use multiple responses to better identify shared predictive features -- a project we might call multitask feature selection. This thesis is organized as follows. Section 1 introduces feature selection for regression, focusing on ell_0 regularization methods and their interpretation within a Minimum Description Length (MDL) framework. Section 2 proposes a novel extension of MDL feature selection to the multitask setting. The approach, called the "Multiple Inclusion Criterion" (MIC), is designed to borrow information across regression tasks by more easily selecting features that are associated with multiple responses. We show in experiments on synthetic and real biological data sets that MIC can reduce prediction error in settings where features are at least partially shared across responses. Section 3 surveys hypothesis testing by regression with a single response, focusing on the parallel between the standard Bonferroni correction and an MDL approach. Mirroring the ideas in Section 2, Section 4 proposes a novel MIC approach to hypothesis testing with multiple responses and shows that on synthetic data with significant sharing of features across responses, MIC sometimes outperforms standard FDR-controlling methods in terms of finding true positives for a given level of false positives. Section 5 concludes.