Uncertainty
Leveraging Domain Knowledge in Multitask Bayesian Network Structure Learning
Oyen, Diane (University of New Mexico) | Lane, Terran (University of New Mexico)
Network structure learning algorithms have aided network discovery in fields such as bioinformatics, neuroscience, ecology and social science. However, challenges remain in learning informative networks for related sets of tasks because the search space of Bayesian network structures is characterized by large basins of approximately equivalent solutions. Multitask algorithms select a set of networks that are near each other in the search space, rather than a score-equivalent set of networks chosen from independent regions of the space. This selection preference allows a domain expert to see only differences supported by the data. However, the usefulness of these algorithms for scientific datasets is limited because existing algorithms naively assume that all pairs of tasks are equally related. We introduce a framework that relaxes this assumption by incorporating domain knowledge about task-relatedness into the learning objective. Using our framework, we introduce the first multitask Bayesian network algorithm that leverages domain knowledge about the relatedness of tasks. We use our algorithm to explore the effect of task-relatedness on network discovery and show that our algorithm learns networks that are closer to ground truth than naive algorithms and that our algorithm discovers patterns that are interesting.
Sparse Probabilistic Relational Projection
Li, Wu-Jun (Shanghai Jiao Tong University) | Yeung, Dit-Yan (Hong Kong University of Science and Technology)
Probabilistic relational PCA (PRPCA) can learn a projection matrix to perform dimensionality reduction for relational data. However, the results learned by PRPCA lack interpretability because each principal component is a linear combination of all the original variables. In this paper, we propose a novel model, called sparse probabilistic relational projection (SPRP), to learn a sparse projection matrix for relational dimensionality reduction. The sparsity in SPRP is achieved by imposing on the projection matrix a sparsity-inducing prior such as the Laplace prior or Jeffreys prior. We propose an expectation-maximization (EM) algorithm to learn the parameters of SPRP. Compared with PRPCA, the sparsity in SPRP not only makes the results more interpretable but also makes the projection operation much more efficient without compromising its accuracy. All these are verified by experiments conducted on several real applications.
Probabilistic Models for Common Spatial Patterns: Parameter-Expanded EM and Variational Bayes
Kang, Hyohyeong (Pohang University of Science and Technology) | Choi, Seungjin (Pohang University of Science and Technology)
Common spatial patterns (CSP) is a popular feature extraction method for discriminating between positive andnegative classes in electroencephalography (EEG) data.Two probabilistic models for CSP were recently developed: probabilistic CSP (PCSP), which is trained by expectation maximization (EM), and variational BayesianCSP (VBCSP) which is learned by variational approx-imation. Parameter expansion methods use auxiliaryparameters to speed up the convergence of EM or thedeterministic approximation of the target distributionin variational inference. In this paper, we describethe development of parameter-expanded algorithms forPCSP and VBCSP, leading to PCSP-PX and VBCSP-PX, whose convergence speed-up and high performanceare emphasized. The convergence speed-up in PCSP-PX and VBCSP-PX is a direct consequence of parame-ter expansion methods. The contribution of this study is the performance improvement in the case of CSP,which is a novel development. Numerical experimentson the BCI competition datasets, III IV a and IV 2ademonstrate the high performance and fast convergenceof PCSP-PX and VBCSP-PX, as compared to PCSP andVBCSP.
A Bayesian Approach to the Data Description Problem
Ghasemi, Alireza (Ecole Polytechnique Federale de Lausanne (EPFL)) | Rabiee, Hamid R. (Sharif University of Technology) | Manzuri, Mohammad Taghi (Sharif University of Technology) | Rohban, Mohammad Hossein (Sharif University of Technology)
In this paper, we address the problem of data description using a Bayesian framework. The goal of data description is to draw a boundary around objects of a certain class of interest to discriminate that class from the rest of the feature space. Data description is also known as one-class learning and has a wide range of applications. The proposed approach uses a Bayesian framework to precisely compute the class boundary and therefore can utilize domain information in form of prior knowledge in the framework. It can also operate in the kernel space and therefore recognize arbitrary boundary shapes. Moreover, the proposed method can utilize unlabeled data in order to improve accuracy of discrimination. We evaluate our method using various real-world datasets and compare it with other state of the art approaches of data description. Experiments show promising results and improved performance over other data description and one-class learning algorithms.
Clustering Documents Along Multiple Dimensions
Dasgupta, Sajib (IBM Almaden Research Center) | Golden, Richard M. (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
Traditional clustering algorithms are designed to search for a single clustering solution despite the fact that multiple alternative solutions might exist for a particular dataset. For example, a set of news articles might be clustered by topic or by the author's gender or age. Similarly, book reviews might be clustered by sentiment or comprehensiveness. In this paper, we address the problem of identifying alternative clustering solutions by developing a Probabilistic Multi-Clustering (PMC) model that discovers multiple, maximally different clusterings of a data sample. Empirical results on six datasets representative of real-world applications show that our PMC model exhibits superior performance to comparable multi-clustering algorithms.
On Finding Optimal Polytrees
Gaspers, Serge (Vienna University of Technology) | Koivisto, Mikko (University of Helsinki) | Liedloff, Mathieu (Université d'Orlèans) | Ordyniak, Sebastian (Vienna University of Technology) | Szeider, Stefan (Vienna University of Technology)
Inferring probabilistic networks from data is a notoriously difficult task. Under various goodness-of-fit measures, finding an optimal network is NP-hard, even if restricted to polytrees of bounded in-degree. Polynomial-time algorithms are known only for rare special cases, perhaps most notably for branchings, that is, polytrees in which the in-degree of every node is at most one. Here, we study the complexity of finding an optimal polytree that can be turned into a branching by deleting some number of arcs or nodes, treated as a parameter. We show that the problem can be solved via a matroid intersection formulation in polynomial time if the number of deleted arcs is bounded by a constant. The order of the polynomial time bound depends on this constant, hence the algorithm does not establish fixed-parameter tractability when parameterized by the number of deleted arcs. We show that a restricted version of the problem allows fixed-parameter tractability and hence scales well with the parameter. We contrast this positive result by showing that if we parameterize by the number of deleted nodes, a somewhat more powerful parameter, the problem is not fixed-parameter tractable, subject to a complexity-theoretic assumption.
Factored Models for Multiscale Decision-Making in Smart Grid Customers
Reddy, Prashant P. (Carnegie Mellon University) | Veloso, Manuela M. (Carnegie Mellon University)
Active participation of customers in the management of demand, and renewable energy supply, is a critical goal of the Smart Grid vision. However, this is a complex problem with numerous scenarios that are difficult to test in field projects. Rich and scalable simulations are required to develop effective strategies and policies that elicit desirable behavior from customers. We present a versatile agent-based "factored model" that enables rich simulation scenarios across distinct customer types and varying agent granularity. We formally characterize the decisions to be made by Smart Grid customers as a multiscale decision-making problem and show how our factored model representation handles several temporal and contextual decisions by introducing a novel "utility optimizing agent." We further contribute innovative algorithms for (i) statistical learning-based hierarchical Bayesian timeseries simulation, and (ii) adaptive capacity control using decision-theoretic approximation of multiattribute utility functions over multiple agents. Prominent among the approaches being studied to achieve active customer participation is one based on offering customers financial incentives through variable-price tariffs; we also contribute an effective solution to the problem of "customer herding" under such tariffs. We support our contributions with experimental results from simulations based on real-world data on an open Smart Grid simulation platform.
Coupling Spatiotemporal Disease Modeling with Diagnosis
Mubangizi, Martin Gordon (Makerere University) | Ikae, Caterine (Makerere University) | Spiliopoulou, Athina (University of Edinburgh) | Quinn, John A. (Makerere University)
Modelling the density of an infectious disease in space and time is a task generally carried out separately from the diagnosis of that disease in individuals. These two inference problems are complementary, however: diagnosis of disease can be done more accurately if prior information from a spatial risk model is employed, and in turn a disease density model can benefit from the incorporation of rich symptomatic information rather than simple counts of presumed cases of infection. We propose a unifying framework for both of these tasks, and illustrate it with the case of malaria. To do this we first introduce a state space model of malaria spread, and secondly a computer vision based system for detecting plasmodium in microscopical blood smear images, which can be run on location-aware mobile devices. We demonstrate the tractability of combining both elements and the improvement in accuracy this brings about.
Global Climate Model Tracking Using Geospatial Neighborhoods
McQuade, Scott (The George Washington University) | Monteleoni, Claire (The George Washington University)
A key problem in climate science is how to combine the predictions of the multi-model ensemble of global climate models. Recent work in machine learning (Monteleoni et al. 2011) showed the promise of an algorithm for online learning with experts for this task.We extend the Tracking Climate Models (TCM) approach to (1) take into account climate model predictions at higher spatial resolutions and (2) to model geospatial neighborhood influence between regions. Our algorithm enables neighborhood influence by modifying the transition dynamics of the Hidden Markov Model used by TCM, allowing the performance of spatial neighbors to influence the temporal switching probabilities for the best expert (climate model) at a given location. In experiments on historical data at a variety of spatial resolutions, our algorithm demonstrates improvements over TCM, when tracking global temperature anomalies.
A Mouse-Trajectory Based Model for Predicting Query-URL Relevance
Hengjie, Song ( Nanyang Technological University Baidu Inc. ) | Liao, Ruoxue (Baidu Inc.) | Zhang, Xiangliang (King Abdullah University of Science and Technology) | Miao, Chunyan (Nanyang Technological University) | Yang, Qiang (Hong Kong University of Science and Technology)
For the learning-to-ranking algorithms used in commercial search engines, a conventional way to generate the training examples is to employ professional annotators to label the relevance of query-url pairs. Since label quality depends on the expertise of annotators to a large extent, this process is time-consuming and labor-intensive. Automatically generating labels from click-through data has been well studied to have comparable or better performance than human judges. Click-through data present users’ action and imply their satisfaction on search results, but exclude the interactions between users and search results beyond the page-view level (e.g., eye and mouse movements). This paper proposes a novel approach to comprehensively consider the information underlying mouse trajectory and click-through data so as to describe user behaviors more objectively and achieve a better understanding of the user experience. By integrating multi-sources data, the proposed approach reveals that the relevance labels of query-url pairs are related to positions of urls and users’ behavioral features. Based on their correlations, query-url pairs can be labeled more accurately and search results are more satisfactory to users. The experiments that are conducted on the most popular Chinese commercial search engine (Baidu) validated the rationality of our research motivation and proved that the proposed approach outperformed the state-of-the-art methods.