Uncertainty


An N Time-Slice Dynamic Chain Event Graph

arXiv.org Machine Learning

The Dynamic Chain Event Graph (DCEG) is able to depict many classes of discrete random processes exhibiting asymmetries in their development and context-specific conditional probability structures. However, paradoxically, this very generality has so far frustrated its wide application. So in this paper we develop an object-oriented method to fully analyse a particularly useful and feasibly implementable new subclass of these graphical models called the N Time-Slice DCEG (NT-DCEG). After demonstrating a close relationship between an NT-DCEG and a specific class of Markov processes, we discuss how graphical modellers can exploit this connection to gain a deep understanding of their processes. We also show how to read from the topology of this graph context-specific independence statements that can then be checked by domain experts. Our methods are illustrated throughout using examples of dynamic multivariate processes describing inmate radicalisation in a prison.


Decision-Making with Belief Functions: a Review

arXiv.org Artificial Intelligence

Approaches to decision-making under uncertainty in the belief function framework are reviewed. Most methods are shown to blend criteria for decision under ignorance with the maximum expected utility principle of Bayesian decision theory. A distinction is made between methods that construct a complete preference relation among acts, and those that allow incomparability of some acts due to lack of information. Methods developed in the imprecise probability framework are applicable in the Dempster-Shafer context and are also reviewed. Shafer's constructive decision theory, which substitutes the notion of goal for that of utility, is described and contrasted with other approaches. The paper ends by pointing out the need to carry out deeper investigation of fundamental issues related to decision-making with belief functions and to assess the descriptive, normative and prescriptive values of the different approaches.
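As a rough illustration of how a criterion for decision under ignorance can be blended with expected utility, the sketch below computes lower and upper expected utilities of a single act from a mass function and then combines them with a Hurwicz-style pessimism index. It is a minimal example under stated assumptions, not the method of any particular paper in the review: the function names, the dictionary representation of the mass function, and the parameter `alpha` are all hypothetical.

```python
from typing import Dict, FrozenSet, Tuple


def expected_utility_bounds(
    mass: Dict[FrozenSet[str], float],
    utility: Dict[str, float],
) -> Tuple[float, float]:
    """Return (lower, upper) expected utility of one act.

    `mass` maps each focal set (a frozenset of states) to its mass;
    `utility` maps each state to the act's utility in that state.
    Both representations are assumptions made for this sketch.
    """
    # Lower bound: within each focal set, assume the worst state occurs.
    lower = sum(m * min(utility[w] for w in focal) for focal, m in mass.items())
    # Upper bound: within each focal set, assume the best state occurs.
    upper = sum(m * max(utility[w] for w in focal) for focal, m in mass.items())
    return lower, upper


def hurwicz_value(
    mass: Dict[FrozenSet[str], float],
    utility: Dict[str, float],
    alpha: float,
) -> float:
    """Blend the two bounds; alpha = 1 is fully pessimistic, alpha = 0 fully optimistic."""
    lower, upper = expected_utility_bounds(mass, utility)
    return alpha * lower + (1.0 - alpha) * upper


# Hypothetical example: half the mass is committed to state "a",
# the other half is uncommitted between "a" and "b".
example_mass = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.5}
example_utility = {"a": 10.0, "b": 0.0}
bounds = expected_utility_bounds(example_mass, example_utility)
blended = hurwicz_value(example_mass, example_utility, alpha=0.5)
```

When every focal set is a singleton, the two bounds coincide and the value reduces to the ordinary Bayesian expected utility, which is why such criteria can be seen as blends of the two principles.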


Probabilistic Ensemble of Collaborative Filters

arXiv.org Machine Learning

Collaborative filtering is an important technique for recommendation. Although it has repeatedly been shown to be effective in previous work, its performance remains unsatisfactory in many real-world applications, especially those where the items or users are highly diverse. In this paper, we explore an ensemble-based framework to enhance the capability of a recommender in handling diverse data. Specifically, we formulate a probabilistic model which integrates the items, the users, and the associations between them into a generative process. On top of this formulation, we further derive a progressive algorithm to construct an ensemble of collaborative filters. In each iteration, a new filter is derived from re-weighted entries and incorporated into the ensemble. It is noteworthy that while our algorithmic procedure appears similar to boosting, it is derived from an essentially different formulation and thus differs in several key technical aspects. We tested the proposed method on three large datasets, and observed substantial improvement over the state of the art, including L2Boost, an effective method based on boosting.


Small Sample Learning in Big Data Era

arXiv.org Machine Learning

As a promising area in artificial intelligence, a new learning paradigm called Small Sample Learning (SSL) has been attracting prominent research attention in recent years. In this paper, we aim to present a survey that comprehensively introduces the current techniques proposed on this topic. Specifically, current SSL techniques can be mainly divided into two categories. The first category of SSL approaches can be called "concept learning", which emphasizes learning new concepts from only a few related observations. The purpose is mainly to simulate human learning behaviors like recognition, generation, imagination, synthesis and analysis. The second category is called "experience learning", which usually co-exists with the large sample learning manner of conventional machine learning. This category mainly focuses on learning with insufficient samples, and is also called small data learning in some of the literature. More extensive surveys on both categories of SSL techniques are introduced, and some neuroscience evidence is provided to clarify the rationality of the entire SSL regime and its relationship with the human learning process. Some discussions on the main challenges and possible future research directions along this line are also presented.


Reconciling Irrational Human Behavior with AI based Decision Making: A Quantum Probabilistic Approach

arXiv.org Artificial Intelligence

There are many examples of human decision making which cannot be modeled by the classical probabilistic and logic models on which current AI systems are based. Hence there is a need for a modeling framework which can enable intelligent systems to detect and predict cognitive biases in human decisions to facilitate better human-agent interaction. We give a few examples of irrational behavior and use a generalized probabilistic model inspired by the mathematical framework of Quantum Theory to model and explain such behavior.


Analyzing Inverse Problems with Invertible Neural Networks

arXiv.org Machine Learning

In many tasks, in particular in natural science, the goal is to determine hidden system parameters from a set of measurements. Often, the forward process from parameter- to measurement-space is a well-defined function, whereas the inverse problem is ambiguous: one measurement may map to multiple different sets of parameters. In this setting, the posterior parameter distribution, conditioned on an input measurement, has to be determined. We argue that a particular class of neural networks is well suited for this task: so-called Invertible Neural Networks (INNs). Although INNs are not new, they have, so far, received little attention in the literature. While classical neural networks attempt to solve the ambiguous inverse problem directly, INNs are able to learn it jointly with the well-defined forward process, using additional latent output variables to capture the information otherwise lost. Given a specific measurement and sampled latent variables, the inverse pass of the INN provides a full distribution over parameter space. We verify experimentally, on artificial data and real-world problems from astrophysics and medicine, that INNs are a powerful analysis tool to find multi-modalities in parameter space, to uncover parameter correlations, and to identify unrecoverable parameters.


Simple Root Cause Analysis by Separable Likelihoods

arXiv.org Machine Learning

Root Cause Analysis (RCA) for anomalies is challenging because of the trade-off between accuracy and the explanatory friendliness required for industrial applications. In this paper we propose a framework for simple and friendly RCA within the Bayesian regime under certain restrictions imposed on the predictive posterior (namely that the Hessian at the mode is diagonal, here referred to as separability). We show that this assumption is satisfied for important base models, including Multinomial, Dirichlet-Multinomial and Naive Bayes. To demonstrate the usefulness of the framework, we embed it into a Bayesian network and validate it on web server error logs (a real-world data set).


A Review of Learning with Deep Generative Models from perspective of graphical modeling

arXiv.org Machine Learning

This document aims to provide a review of learning with deep generative models (DGMs), which is a highly active area in machine learning and, more generally, artificial intelligence. This review is not meant to be a tutorial, but when necessary, we provide self-contained derivations for completeness. This review has two features. First, though there are different perspectives from which to classify DGMs, we choose to organize this review from the perspective of graphical modeling, because the learning methods for directed DGMs and undirected DGMs are fundamentally different. Second, we differentiate model definitions from model learning algorithms, since different learning algorithms can be applied to solve the learning problem on the same model, and an algorithm can be applied to learn different models. We thus separate model definition and model learning, with more emphasis on reviewing, differentiating and connecting different learning algorithms. We also discuss promising future research directions. This review is by no means comprehensive as the field is evolving rapidly. The authors apologize in advance for any missed papers and inaccuracies in descriptions. Corrections and comments are highly welcome.


Plithogeny, Plithogenic Set, Logic, Probability, and Statistics

arXiv.org Artificial Intelligence

In this book we introduce the plithogenic set (as a generalization of crisp, fuzzy, intuitionistic fuzzy, and neutrosophic sets), plithogenic logic (as a generalization of classical, fuzzy, intuitionistic fuzzy, and neutrosophic logics), plithogenic probability (as a generalization of classical, imprecise, and neutrosophic probabilities), and plithogenic statistics (as a generalization of classical and neutrosophic statistics). A plithogenic set is a set whose elements are characterized by one or more attributes, and each attribute may have many values. An attribute value v has a corresponding (fuzzy, intuitionistic fuzzy, or neutrosophic) degree of appurtenance d(x,v) of the element x to the set P, with respect to some given criteria. In order to obtain better accuracy for the plithogenic aggregation operators in the plithogenic set, logic, and probability, and a more exact inclusion (partial order), a (fuzzy, intuitionistic fuzzy, or neutrosophic) contradiction (dissimilarity) degree is defined between each attribute value and the dominant (most important) attribute value. The plithogenic intersection and union are linear combinations of the fuzzy operators t-norm and t-conorm, while the plithogenic complement, inclusion, and equality are influenced by the attribute values' contradiction (dissimilarity) degrees. Formal definitions of plithogenic set, logic, probability, and statistics are presented in the book, followed by plithogenic aggregation operators, various theorems related to them, and afterwards examples and applications of these new concepts in our everyday life.


How Complex is your classification problem? A survey on measuring classification complexity

arXiv.org Machine Learning

Extracting characteristics from the training datasets of classification problems has proven effective in a number of meta-analyses. Among them, measures of classification complexity can estimate the difficulty in separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the existing measures for this characterization. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn be focused on challenging characteristics of the problems. This paper surveys and analyzes measures which can be extracted from the training datasets in order to characterize the complexity of the respective classification problems. Their use in recent literature is also reviewed and discussed, which points to opportunities for future work in the area. Finally, we describe an R package named Extended Complexity Library (ECoL) that implements a set of complexity measures and is publicly available.
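To make the idea of a complexity measure concrete, the sketch below computes a classical per-feature Fisher discriminant ratio for a two-class problem: the squared gap between class means divided by the sum of the class variances. A large ratio on any feature suggests the classes are easy to separate along it. This is only an illustrative assumption about what such a measure can look like; it is written in Python rather than R, the function name is hypothetical, and the exact definitions used in ECoL may differ.

```python
from statistics import mean, pvariance
from typing import List, Sequence


def fisher_ratios(
    class_a: Sequence[Sequence[float]],
    class_b: Sequence[Sequence[float]],
) -> List[float]:
    """Per-feature Fisher discriminant ratio for two classes.

    Each argument is a list of samples, and each sample is a list of
    feature values. Population variance is used; that is a choice made
    for this sketch, not something prescribed by the survey.
    """
    ratios = []
    # zip(*rows) transposes the samples into per-feature columns.
    for col_a, col_b in zip(zip(*class_a), zip(*class_b)):
        gap = (mean(col_a) - mean(col_b)) ** 2
        spread = pvariance(col_a) + pvariance(col_b)
        if spread == 0:
            # Zero within-class spread: perfectly separable if the means differ.
            ratios.append(float("inf") if gap > 0 else 0.0)
        else:
            ratios.append(gap / spread)
    return ratios


# Hypothetical toy data: feature 0 separates the classes well, feature 1 does not.
toy_a = [[0.0, 1.0], [2.0, 3.0]]
toy_b = [[10.0, 2.0], [12.0, 4.0]]
toy_ratios = fisher_ratios(toy_a, toy_b)
most_discriminative = max(toy_ratios)
```

Taking the maximum over features, as in the last line, is one common way to summarize the per-feature ratios into a single score for the whole dataset.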