Directed Networks
Anytime Induction of Low-cost, Low-error Classifiers: a Sampling-based Approach
Machine learning techniques are gaining prevalence in the production of a wide range of classifiers for complex real-world applications with nonuniform testing and misclassification costs. The increasing complexity of these applications poses a real challenge to resource management during learning and classification. In this work we introduce ACT (anytime cost-sensitive tree learner), a novel framework for operating in such complex environments. ACT is an anytime algorithm that allows learning time to be increased in return for lower classification costs. It builds a tree top-down and exploits additional time resources to obtain better estimations for the utility of the different candidate splits. Using sampling techniques, ACT approximates the cost of the subtree under each candidate split and favors the one with a minimal cost. As a stochastic algorithm, ACT is expected to be able to escape local minima, into which greedy methods may be trapped. Experiments with a variety of datasets were conducted to compare ACT to the state-of-the-art cost-sensitive tree learners. The results show that for the majority of domains ACT produces significantly less costly trees. ACT also exhibits good anytime behavior with diminishing returns.
Finding links and initiators: a graph reconstruction problem
Mannila, Heikki, Terzi, Evimaria
Analyzing 0-1 matrices is one of the main themes in data mining. Techniques such as clustering or mixture modelling, matrix decomposition techniques such as PCA, ICA, and NMR, and Bayesian all aim to give an answer to the informal question: "Where does the matrix come from?" These approaches aim at describing a probabilistic generative model that describes the observed matrix well. In this paper we consider yet another way of answering the question "Where does a 0-1 matrix M come from?" In our model, the matrix M of size n m is considered to arise from initiators, certain few entries that are initially 1. The initiators propagate their 1's by following the links of a directed influence graph G (represented by an n n adjacency matrix). We denote the initiator matrix of size n m by N and we use G (of size n n) to refer both to the directed graph between the rows of M and as well as its adjacency matrix. Then, we believe that the structure of N and G can tell how a matrix M has been created.
Latent Tree Models and Approximate Inference in Bayesian Networks
Wang, Y., Zhang, N. L., Chen, T.
We propose a novel method for approximate inference in Bayesian networks (BNs). The idea is to sample data from a BN, learn a latent tree model (LTM) from the data offline, and when online, make inference with the LTM instead of the original BN. Because LTMs are tree-structured, inference takes linear time. In the meantime, they can represent complex relationship among leaf nodes and hence the approximation accuracy is often good. Empirical evidence shows that our method can achieve good approximation accuracy at low online computational cost.
Uncertainty quantification in complex systems using approximate solvers
Koutsourelakis, Phaedon-Stelios
This paper proposes a novel uncertainty quantification framework for computationally demanding systems characterized by a large vector of non-Gaussian uncertainties. It combines state-of-the-art techniques in advanced Monte Carlo sampling with Bayesian formulations. The key departure from existing works is the use of inexpensive, approximate computational models in a rigorous manner. Such models can readily be derived by coarsening the discretization size in the solution of the governing PDEs, increasing the time step when integration of ODEs is performed, using fewer iterations if a non-linear solver is employed or making use of lower order models. It is shown that even in cases where the inexact models provide very poor approximations of the exact response, statistics of the latter can be quantified accurately with significant reductions in the computational effort. Multiple approximate models can be used and rigorous confidence bounds of the estimates produced are provided at all stages.
Relations among conditional probabilities
We describe a Groebner basis of relations among conditional probabilities in a discrete probability space, with any set of conditioned-upon events. They may be specialized to the partially-observed random variable case, the purely conditional case, and other special cases. We also investigate the connection to generalized permutohedra and describe a conditional probability simplex.
Use of a Quantum Computer and the Quick Medical Reference To Give an Approximate Diagnosis
The Quick Medical Reference (QMR) is a compendium of statistical knowledge connecting diseases to findings (symptoms). The information in QMR can be represented as a Bayesian network. The inference problem (or, in more medical language, giving a diagnosis) for the QMR is to, given some findings, find the probability of each disease. Rejection sampling and likelihood weighted sampling (a.k.a. likelihood weighting) are two simple algorithms for making approximate inferences from an arbitrary Bayesian net (and from the QMR Bayesian net in particular). Heretofore, the samples for these two algorithms have been obtained with a conventional "classical computer". In this paper, we will show that two analogous algorithms exist for the QMR Bayesian net, where the samples are obtained with a quantum computer. We expect that these two algorithms, implemented on a quantum computer, can also be used to make inferences (and predictions) with other Bayesian nets.
Text Data Mining: Theory and Methods
This paper provides the reader with a very brief introduction to some of the theory and methods of text data mining. The intent of this article is to introduce the reader to some of the current methodologies that are employed within this discipline area while at the same time making the reader aware of some of the interesting challenges that remain to be solved within the area. Finally, the articles serves as a very rudimentary tutorial on some of techniques while also providing the reader with a list of references for additional study.
Catching Up Faster by Switching Sooner: A Prequential Solution to the AIC-BIC Dilemma
van Erven, Tim, Grunwald, Peter, de Rooij, Steven
Bayesian model averaging, model selection and its approximations such as BIC are generally statistically consistent, but sometimes achieve slower rates og convergence than other methods such as AIC and leave-one-out cross-validation. On the other hand, these other methods can br inconsistent. We identify the "catch-up phenomenon" as a novel explanation for the slow convergence of Bayesian methods. Based on this analysis we define the switch distribution, a modification of the Bayesian marginal distribution. We show that, under broad conditions,model selection and prediction based on the switch distribution is both consistent and achieves optimal convergence rates, thereby resolving the AIC-BIC dilemma. The method is practical; we give an efficient implementation. The switch distribution has a data compression interpretation, and can thus be viewed as a "prequential" or MDL method; yet it is different from the MDL methods that are usually considered in the literature. We compare the switch distribution to Bayes factor model selection and leave-one-out cross-validation.
A Bayesian Approach to Network Modularity
Hofman, Jake M., Wiggins, Chris H.
We present an efficient, principled, and interpretable technique for inferring module assignments and for identifying the optimal number of modules in a given network. We show how several existing methods for finding modules can be described as variant, special, or limiting cases of our work, and how the method overcomes the resolution limit problem, accurately recovering the true number of modules. Our approach is based on Bayesian methods for model selection which have been used with success for almost a century, implemented using a variational technique developed only in the past decade. We apply the technique to synthetic and real networks and outline how the method naturally allows selection among competing models.
Conditioning Probabilistic Databases
Past research on probabilistic databases has studied the problem of answering queries on a static database. Application scenarios of probabilistic databases however often involve the conditioning of a database using additional information in the form of new evidence. The conditioning problem is thus to transform a probabilistic database of priors into a posterior probabilistic database which is materialized for subsequent query processing or further refinement. It turns out that the conditioning problem is closely related to the problem of computing exact tuple confidence values. It is known that exact confidence computation is an NP-hard problem. This has led researchers to consider approximation techniques for confidence computation. However, neither conditioning nor exact confidence computation can be solved using such techniques. In this paper we present efficient techniques for both problems. We study several problem decomposition methods and heuristics that are based on the most successful search techniques from constraint satisfaction, such as the Davis-Putnam algorithm. We complement this with a thorough experimental evaluation of the algorithms proposed. Our experiments show that our exact algorithms scale well to realistic database sizes and can in some scenarios compete with the most efficient previous approximation algorithms.