 Bayesian Inference


Sum-Product Graphical Models

arXiv.org Machine Learning

This paper introduces a new probabilistic architecture called Sum-Product Graphical Model (SPGM). SPGMs combine traits from Sum-Product Networks (SPNs) and Graphical Models (GMs): Like SPNs, SPGMs always enable tractable inference using a class of models that incorporate context-specific independence. Like GMs, SPGMs provide a high-level model interpretation in terms of conditional independence assumptions and corresponding factorizations. Thus, the new architecture represents a class of probability distributions that combines, for the first time, the semantics of graphical models with the evaluation efficiency of SPNs. We also propose a novel algorithm for learning both the structure and the parameters of SPGMs. A comparative empirical evaluation demonstrates competitive performance of our approach in density estimation.


Bayesian Network Learning via Topological Order

arXiv.org Machine Learning

We propose a mixed integer programming (MIP) model and iterative algorithms based on topological orders to solve optimization problems with acyclic constraints on a directed graph. The proposed MIP model has a significantly lower number of constraints compared to popular MIP models based on cycle elimination constraints and triangular inequalities. The proposed iterative algorithms use gradient descent and iterative reordering approaches, respectively, for searching topological orders. A computational experiment is presented for the Gaussian Bayesian network learning problem, an optimization problem minimizing the sum of squared errors of regression models with L1 penalty over a feature network with application of gene network inference in bioinformatics.
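As a rough illustration of why searching over topological orders sidesteps explicit acyclicity constraints, the sketch below builds the set of edges permitted by a given order and evaluates a penalized sum-of-squared-errors objective. The function names, the weight matrix `B`, and the penalty weight `lam` are assumptions made for this example; the abstract does not specify the authors' exact formulation or algorithms.

```python
import numpy as np

def order_mask(order):
    """allowed[i, j] is True when node i precedes node j in `order`."""
    p = len(order)
    position = np.empty(p, dtype=int)
    position[np.asarray(order)] = np.arange(p)
    return position[:, None] < position[None, :]

def respects_order(B, order):
    """True when every nonzero B[i, j] is an edge i -> j allowed by the order.

    Any weight matrix that passes this check describes an acyclic graph,
    because edges only point forward in the order.
    """
    return bool(np.all((B != 0) <= order_mask(order)))

def penalized_sse(X, B, lam):
    """Sum of squared regression residuals plus an L1 penalty on edge weights.

    Column j of B holds the regression coefficients of variable j on its parents.
    """
    residual = X - X @ B
    return float(np.sum(residual ** 2) + lam * np.sum(np.abs(B)))

# Hypothetical usage: three variables, with the order 2 -> 0 -> 1.
X = np.array([[1.0, 2.0, 3.0]])
B = np.zeros((3, 3))
B[2, 0] = 1.0  # edge from node 2 to node 0
print(respects_order(B, [2, 0, 1]), penalized_sse(X, B, lam=0.5))
```

With the order fixed, each variable can only be regressed on its predecessors, so the remaining problem is a set of ordinary L1-penalized regressions.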


Probabilistic Reasoning with Abstract Argumentation Frameworks

Journal of Artificial Intelligence Research

Abstract argumentation offers an appealing way of representing and evaluating arguments and counterarguments. This approach can be enhanced by considering probability assignments on arguments, allowing for a quantitative treatment of formal argumentation. In this paper, we regard the assignment as denoting the degree of belief that an agent has in an argument being acceptable. While there are various interpretations of this, an example is how it could be applied to a deductive argument. Here, the degree of belief that an agent has in an argument being acceptable is a combination of the degree to which it believes the premises, the claim, and the derivation of the claim from the premises. We consider constraints on these probability assignments, inspired by crisp notions from classical abstract argumentation frameworks and discuss the issue of probabilistic reasoning with abstract argumentation frameworks. Moreover, we consider the scenario when assessments on the probabilities of a subset of the arguments are given and the probabilities of the remaining arguments have to be derived, taking both the topology of the argumentation framework and principles of probabilistic reasoning into account. We generalise this scenario by also considering inconsistent assessments, i.e., assessments that contradict the topology of the argumentation framework. Building on approaches to inconsistency measurement, we present a general framework to measure the amount of conflict of these assessments and provide a method for inconsistency-tolerant reasoning.


AI – The Present in the Making

#artificialintelligence

I attended the Huawei European Innovation Day recently, and was enthralled by how the new technology is giving rise to industrial revolutions. These revolutions are what will eventually unlock the development potential around the world. It is important to leverage the emerging technologies, since they are the resources which will lead us to innovation and progress. Huawei is innovative in its partnerships and collaboration to define the future, and the event was a huge success. For many people, the concept of Artificial Intelligence (AI) is a thing of the future. It is the technology that has yet to be introduced.


Natural Language Processing: State of The Art, Current Trends and Challenges

arXiv.org Artificial Intelligence

Natural language processing (NLP) has recently gained much attention for representing and analysing human language computationally. Its applications have spread to various fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. The paper is organized in four phases: it discusses the different levels of NLP and the components of Natural Language Generation (NLG), presents the history and evolution of NLP, surveys the state of the art across the various applications of NLP, and reviews current trends and challenges.


Pseudo-extended Markov chain Monte Carlo

arXiv.org Machine Learning

Sampling from the posterior distribution using Markov chain Monte Carlo (MCMC) methods can require an exhaustive number of iterations to fully explore the correct posterior. This is often the case when the posterior of interest is multi-modal, as the MCMC sampler can become trapped in a local mode for a large number of iterations. In this paper, we introduce the pseudo-extended MCMC method as an approach for improving the mixing of the MCMC sampler in complex posterior distributions. The pseudo-extended method augments the state-space of the posterior using pseudo-samples as auxiliary variables, where on the extended space, the MCMC sampler is able to easily move between the well-separated modes of the posterior. We apply the pseudo-extended method within a Hamiltonian Monte Carlo sampler and show that by using the No-U-Turn algorithm (Hoffman and Gelman, 2014), our proposed sampler is completely tuning free. We compare the pseudo-extended method against well-known tempered MCMC algorithms and show the advantages of the new sampler on a number of challenging examples from the statistics literature.


Information-based inference for singular models and finite sample sizes

arXiv.org Machine Learning

A central problem in statistics is model selection, the choice between competing models of a stochastic process whose observables are corrupted by noise. In the information-based paradigm of inference, model selection is performed by estimating the predictive performance of the competing models. The candidate model with the best estimated predictive performance is selected. Information-based inference is dependent on the accuracy of the estimate of the predictive complexity, a measure of the flexibility of the model in fitting the data. A large-sample-size approximation for the performance is the Akaike Information Criterion (AIC). The AIC approximation fails in a wide range of important applications, either significantly under- or over-estimating the complexity. We introduce an improved approximation for the complexity which we use to define a new information criterion: the frequentist information criterion (FIC). FIC extends the applicability of information-based inference to the finite-sample-size regime of regular models and to singular models. We demonstrate the power of the approach in a number of example problems.
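For reference, the AIC mentioned above is commonly written as follows. The symbols are the usual textbook notation rather than the paper's own: $k$ is the number of fitted parameters and $\hat{L}$ is the maximized likelihood. The abstract does not give the form of the FIC complexity term, so only the AIC is shown.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

In this form the term $2k$ plays the role of the complexity estimate, and the model with the smallest AIC is selected.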


Frequentist coverage and sup-norm convergence rate in Gaussian process regression

arXiv.org Machine Learning

Gaussian process (GP) regression is a powerful interpolation technique due to its flexibility in capturing non-linearity. In this paper, we provide a general framework for understanding the frequentist coverage of point-wise and simultaneous Bayesian credible sets in GP regression. As an intermediate result, we develop a Bernstein-von Mises type result under supremum norm in random design GP regression. Identifying both the mean and covariance function of the posterior distribution of the Gaussian process as regularized $M$-estimators, we show that the sampling distribution of the posterior mean function and the centered posterior distribution can be respectively approximated by two population level GPs. By developing a comparison inequality between two GPs, we provide exact characterization of frequentist coverage probabilities of Bayesian point-wise credible intervals and simultaneous credible bands of the regression function. Our results show that inference based on GP regression tends to be conservative; when the prior is under-smoothed, the resulting credible intervals and bands have minimax-optimal sizes, with their frequentist coverage converging to a non-degenerate value between their nominal level and one. As a byproduct of our theory, we show that GP regression also yields a minimax-optimal posterior contraction rate relative to the supremum norm, which provides positive evidence on the long-standing problem of the optimal supremum norm contraction rate in GP regression.
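To make the objects in this abstract concrete, here is a minimal sketch of the point-wise credible intervals whose coverage is being studied. It uses the standard GP regression posterior formulas with a squared-exponential kernel of unit prior variance and a fixed noise variance; the kernel choice, parameter values, and function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel with unit prior variance (1-D inputs)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.1, length_scale=1.0):
    """Posterior mean and point-wise variance of the latent function."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, length_scale)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y, computed through the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance k(x, x) is 1 here
    return mean, var

def credible_interval(mean, var, z=1.96):
    """Point-wise interval: mean +/- z posterior standard deviations."""
    half_width = z * np.sqrt(np.maximum(var, 0.0))
    return mean - half_width, mean + half_width

# Hypothetical usage on a toy data set.
x = np.array([0.0, 1.0, 2.0])
y = np.sin(x)
mean, var = gp_posterior(x, y, np.array([0.5, 1.5]))
lower, upper = credible_interval(mean, var)
```

The intervals are Bayesian statements under the prior; the paper's question is how often such intervals contain the true regression function under repeated sampling.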


Sparse Partially Collapsed MCMC for Parallel Inference in Topic Models

arXiv.org Machine Learning

Topic models, and more specifically the class of Latent Dirichlet Allocation (LDA), are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models on five well-known text corpora of differing sizes and properties. In particular, we propose and compare two different strategies for sampling the parameter block with latent topic indicators. The experiments show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed, and can be more than compensated by the speedup from parallelization and sparsity on larger corpora. We also prove that the partially collapsed samplers scale well with the size of the corpus. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler.


Machine Learning for Survival Analysis: A Survey

arXiv.org Machine Learning

Accurately predicting the time of occurrence of an event of interest is a critical problem in longitudinal data analysis. One of the main challenges in this context is the presence of instances whose event outcomes become unobservable after a certain time point, or instances that do not experience any event during the monitoring period. This phenomenon is called censoring, and it can be effectively handled using survival analysis techniques. Traditionally, statistical approaches have been widely developed in the literature to overcome this censoring issue. In addition, many machine learning algorithms have been adapted to effectively handle survival data and tackle other challenging problems that arise in real-world data. In this survey, we provide a comprehensive and structured review of the representative statistical methods along with the machine learning techniques used in survival analysis and provide a detailed taxonomy of the existing methods. We also discuss several topics that are closely related to survival analysis and illustrate several successful applications in various real-world application domains. We hope that this paper will provide a more thorough understanding of the recent advances in survival analysis and offer some guidelines on applying these approaches to solve new problems that arise in applications with censored data.
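As one concrete example of how a classical statistical method handles censoring, the sketch below implements the Kaplan-Meier estimator of the survival function. The abstract does not single out this estimator; it is chosen here as a well-known illustration, and the function and variable names are assumptions.

```python
def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times    -- follow-up time for each instance
    observed -- True if the event occurred at that time, False if censored
    """
    pairs = sorted(zip(times, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all instances that share the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if events > 0:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored instances leave the risk set without lowering S(t).
        at_risk -= leaving
    return curve

# Hypothetical usage: five instances, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
```

Censored instances still count in the risk set up to the time they drop out, which is how the estimator uses their partial information instead of discarding them.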