Goto

Collaborating Authors

 Uncertainty


Bayesian Approach to Neuro-Rough Models

arXiv.org Artificial Intelligence

This paper proposes a new neuro-rough model for modelling the risk of HIV from demographic data. The model is formulated using Bayesian framework and trained using Markov Chain Monte Carlo method and Metropolis criterion. When the model was tested to estimate the risk of HIV infection given the demographic data it was found to give the accuracy of 62% as opposed to 58% obtained from a Bayesian formulated rough set model trained using Markov chain Monte Carlo method and 62% obtained from a Bayesian formulated multi-layered perceptron (MLP) model trained using hybrid Monte. The proposed model is able to combine the accuracy of the Bayesian MLP model and the transparency of Bayesian rough set model. Keywords: Neuro-rough model, multi-layered perceptron, Bayesian, HIV modelling Introduction The role of machine learning is to be able to make predictions given a set of inputs.


An Algebraic Graphical Model for Decision with Uncertainties, Feasibilities, and Utilities

Journal of Artificial Intelligence Research

Numerous formalisms and dedicated algorithms have been designed in the last decades to model and solve decision making problems. Some formalisms, such as constraint networks, can express "simple" decision problems, while others are designed to take into account uncertainties, unfeasible decisions, and utilities. Even in a single formalism, several variants are often proposed to model different types of uncertainty (probability, possibility...) or utility (additive or not). In this article, we introduce an algebraic graphical model that encompasses a large number of such formalisms: (1) we first adapt previous structures from Friedman, Chu and Halpern for representing uncertainty, utility, and expected utility in order to deal with generic forms of sequential decision making; (2) on these structures, we then introduce composite graphical models that express information via variables linked by "local" functions, thanks to conditional independence; (3) on these graphical models, we finally define a simple class of queries which can represent various scenarios in terms of observabilities and controllabilities. A natural decision-tree semantics for such queries is completed by an equivalent operational semantics, which induces generic algorithms. The proposed framework, called the Plausibility-Feasibility-Utility (PFU) framework, not only provides a better understanding of the links between existing formalisms, but it also covers yet unpublished frameworks (such as possibilistic influence diagrams) and unifies formalisms such as quantified boolean formulas and influence diagrams. Our backtrack and variable elimination generic algorithms are a first step towards unified algorithms.


Learning Probabilistic Models of Word Sense Disambiguation

arXiv.org Artificial Intelligence

This dissertation presents several new methods of supervised and unsupervised learning of word sense disambiguation models. The supervised methods focus on performing model searches through a space of probabilistic models, and the unsupervised methods rely on the use of Gibbs Sampling and the Expectation Maximization (EM) algorithm. In both the supervised and unsupervised case, the Naive Bayesian model is found to perform well. An explanation for this success is presented in terms of learning rates and bias-variance decompositions.


A Generalized Information Formula as the Bridge between Shannon and Popper

arXiv.org Artificial Intelligence

A generalized information formula related to logical probability and fuzzy set is deduced from the classical information formula. The new information measure accords with to Popper's criterion for knowledge evolution very much. In comparison with square error criterion, the information criterion does not only reflect error of a proposition, but also reflects the particularity of the event described by the proposition. It gives a proposition with less logical probability higher evaluation. The paper introduces how to select a prediction or sentence from many for forecasts and language translations according to the generalized information criterion. It also introduces the rate fidelity theory, which comes from the improvement of the rate distortion theory in the classical information theory by replacing distortion (i.e. average error) criterion with the generalized mutual information criterion, for data compression and communication efficiency. Some interesting conclusions are obtained from the rate-fidelity function in relation to image communication. It also discusses how to improve Popper's theory.


Learning Symbolic Models of Stochastic Domains

Journal of Artificial Intelligence Research

In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned. Through experiments in simple planning domains and a 3D simulated blocks world with realistic physics, we demonstrate that this learning algorithm allows agents to effectively model world dynamics.


Neutrality and Many-Valued Logics

arXiv.org Artificial Intelligence

In this book, we consider various many-valued logics: standard, linear, hyperbolic, parabolic, non-Archimedean, p-adic, interval, neutrosophic, etc. We survey also results which show the tree different proof-theoretic frameworks for many-valued logics, e.g. frameworks of the following deductive calculi: Hilbert's style, sequent, and hypersequent. We present a general way that allows to construct systematically analytic calculi for a large family of non-Archimedean many-valued logics: hyperrational-valued, hyperreal-valued, and p-adic valued logics characterized by a special format of semantics with an appropriate rejection of Archimedes' axiom. These logics are built as different extensions of standard many-valued logics (namely, Lukasiewicz's, Goedel's, Product, and Post's logics). The informal sense of Archimedes' axiom is that anything can be measured by a ruler. Also logical multiple-validity without Archimedes' axiom consists in that the set of truth values is infinite and it is not well-founded and well-ordered. On the base of non-Archimedean valued logics, we construct non-Archimedean valued interval neutrosophic logic INL by which we can describe neutrality phenomena.


Model Selection Through Sparse Maximum Likelihood Estimation

arXiv.org Artificial Intelligence

We consider the problem of estimating the parameters of a Gaussian or binary distribution in such a way that the resulting undirected graphical model is sparse. Our approach is to solve a maximum likelihood problem with an added l_1-norm penalty term. The problem as formulated is convex but the memory requirements and complexity of existing interior point methods are prohibitive for problems with more than tens of nodes. We present two new algorithms for solving problems with at least a thousand nodes in the Gaussian case. Our first algorithm uses block coordinate descent, and can be interpreted as recursive l_1-norm penalized regression. Our second algorithm, based on Nesterov's first order method, yields a complexity estimate with a better dependence on problem size than existing interior point methods. Using a log determinant relaxation of the log partition function (Wainwright & Jordan (2006)), we show that these same algorithms can be used to solve an approximate sparse maximum likelihood problem for the binary case. We test our algorithms on synthetic data, as well as on gene expression and senate voting records data.


A tutorial on conformal prediction

arXiv.org Machine Learning

Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability $\epsilon$, together with a method that makes a prediction $\hat{y}$ of a label $y$, it produces a set of labels, typically containing $\hat{y}$, that also contains $y$ with probability $1-\epsilon$. Conformal prediction can be applied to any method for producing $\hat{y}$: a nearest-neighbor method, a support-vector machine, ridge regression, etc. Conformal prediction is designed for an on-line setting in which labels are predicted successively, each one being revealed before the next is predicted. The most novel and valuable feature of conformal prediction is that if the successive examples are sampled independently from the same distribution, then the successive predictions will be right $1-\epsilon$ of the time, even though they are based on an accumulating dataset rather than on independent datasets. In addition to the model under which successive examples are sampled independently, other on-line compression models can also use conformal prediction. The widely used Gaussian linear model is one of these. This tutorial presents a self-contained account of the theory of conformal prediction and works through several numerical examples. A more comprehensive treatment of the topic is provided in "Algorithmic Learning in a Random World", by Vladimir Vovk, Alex Gammerman, and Glenn Shafer (Springer, 2005).


Mixed membership stochastic blockmodels

arXiv.org Machine Learning

Observations consisting of measurements on relationships for pairs of objects arise in many settings, such as protein interaction and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilisic models can be delicate because the simple exchangeability assumptions underlying many boilerplate models no longer hold. In this paper, we describe a latent variable model of such data called the mixed membership stochastic blockmodel. This model extends blockmodels for relational data to ones which capture mixed membership latent relational structure, thus providing an object-specific low-dimensional representation. We develop a general variational inference algorithm for fast approximate posterior inference. We explore applications to social and protein interaction networks.


Using Genetic Algorithms to Optimise Rough Set Partition Sizes for HIV Data Analysis

arXiv.org Artificial Intelligence

In this paper, we present a method to optimise rough set partition sizes, to which rule extraction is performed on HIV data. The genetic algorithm optimisation technique is used to determine the partition sizes of a rough set in order to maximise the rough sets prediction accuracy. The proposed method is tested on a set of demographic properties of individuals obtained from the South African antenatal survey. Six demographic variables were used in the analysis, these variables are; race, age of mother, education, gravidity, parity, and age of father, with the outcome or decision being either HIV positive or negative. Rough set theory is chosen based on the fact that it is easy to interpret the extracted rules. The prediction accuracy of equal width bin partitioning is 57.7% while the accuracy achieved after optimising the partitions is 72.8%. Several other methods have been used to analyse the HIV data and their results are stated and compared to that of rough set theory (RST).