 Uncertainty


Scalable Discrete Sampling as a Multi-Armed Bandit Problem

arXiv.org Machine Learning

Drawing a sample from a discrete distribution is one of the building blocks of Monte Carlo methods. Like other sampling algorithms, discrete sampling suffers from a high computational burden in large-scale inference problems. We study the problem of sampling a discrete random variable with a high degree of dependency that is typical in large-scale Bayesian inference and graphical models, and propose an efficient approximate solution with a subsampling approach. We make a novel connection between discrete sampling and Multi-Armed Bandit problems with a finite reward population and provide three algorithms with theoretical guarantees. Empirical evaluations show the robustness and efficiency of the approximate algorithms in both synthetic and real-world large-scale problems.
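For context, the exact baseline that approximate methods of this kind try to make cheaper can be sketched as inverse-CDF sampling over unnormalized weights. This is only an illustration under stated assumptions: it is not the paper's subsampling or bandit algorithm, and the helper name `sample_discrete` and its arguments are hypothetical.

```python
import bisect
import itertools
import random


def sample_discrete(weights, rng=random):
    """Draw an index i with probability weights[i] / sum(weights).

    Hypothetical helper for illustration only: exact inverse-CDF sampling,
    which requires every weight to be fully evaluated before drawing.
    """
    if not weights or any(w < 0 for w in weights):
        raise ValueError("weights must be non-empty and non-negative")
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    u = rng.random() * total  # uniform on [0, total)
    # Number of cumulative sums <= u, i.e. the first index whose
    # cumulative weight strictly exceeds u; clamped as a float-safety guard.
    return min(bisect.bisect_right(cumulative, u), len(weights) - 1)
```

In large-scale inference each weight may itself be a product over many data points, so computing all of them exactly is the costly step that a subsampling approach would aim to avoid.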


Probabilistic Graphical Models on Multi-Core CPUs using Java 8

arXiv.org Artificial Intelligence

In this paper, we discuss software design issues related to the development of parallel computational intelligence algorithms on multi-core CPUs, using the new Java 8 functional programming features. In particular, we focus on probabilistic graphical models (PGMs) and present the parallelisation of a collection of algorithms that deal with inference and learning of PGMs from data: namely, maximum likelihood estimation, importance sampling, and greedy search for solving combinatorial optimisation problems. Through these concrete examples, we tackle the problem of defining efficient data structures for PGMs and parallel processing of same-size batches of data sets using Java 8 features. We also provide straightforward techniques to code parallel algorithms that seamlessly exploit multi-core processors. The experimental analysis, carried out using our open source AMIDST (Analysis of MassIve Data STreams) Java toolbox, shows the merits of the proposed solutions.


5 Skills You Need to Become a Machine Learning Engineer

#artificialintelligence

The world is unquestionably changing in rapid and dramatic ways, and the demand for Machine Learning engineers is going to keep increasing exponentially. Machine Learning has undoubtedly arrived. To begin, there are two very important things that you should understand if you're considering a career as a Machine Learning engineer. First, you don't necessarily have to have a research or academic background. Second, it's not enough to have either software engineering or data science experience.


Sparse group factor analysis for biclustering of multiple data sources

arXiv.org Machine Learning

Motivation: Modelling methods that find structure in data are necessary with the current large volumes of genomic data, and there have been various efforts to find subsets of genes exhibiting consistent patterns over subsets of treatments. These biclustering techniques have focused on one data source, often gene expression data. We present a Bayesian approach for joint biclustering of multiple data sources, extending a recent method Group Factor Analysis (GFA) to have a biclustering interpretation with additional sparsity assumptions. The resulting method enables data-driven detection of linear structure present in parts of the data sources. Results: Our simulation studies show that the proposed method reliably infers bi-clusters from heterogeneous data sources. We tested the method on data from the NCI-DREAM drug sensitivity prediction challenge, resulting in an excellent prediction accuracy. Moreover, the predictions are based on several biclusters which provide insight into the data sources, in this case on gene expression, DNA methylation, protein abundance, exome sequence, functional connectivity fingerprints and drug sensitivity.


Bayes' Theorem and Robot Arms - Open Data Science Conferences

#artificialintelligence

If you enjoyed Jesse's presentation at ODSC's last Boston Big Data Conference, come to ODSC East this May to hear from his colleagues. Rather than start with the statement of Bayes' Theorem, I want to use an old math teacher trick (which I realize many students hate) of trying to derive it from scratch, without stating what we're trying to derive. Rather, we'll start by modifying a problem that I described in an earlier post on probability distributions. Bayes' Theorem gives you a way of determining the probability that a given event will occur, or that a given condition is true, given your knowledge of another related event or condition. All the examples that I've read or heard about seemed somewhat contrived and unrelated to the sorts of data analysis I was interested in.
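As a small worked illustration of the rule described here, the posterior can be computed as P(A | B) = P(B | A) P(A) / P(B), with P(B) expanded by the law of total probability. The function name `posterior` and the numbers (a 1% prior, a 90% hit rate, a 5% false-alarm rate) are assumptions made up for this sketch and do not come from the post.

```python
def posterior(prior, likelihood, false_alarm):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A).
    evidence = likelihood * prior + false_alarm * (1.0 - prior)
    if evidence == 0:
        raise ValueError("P(B) is zero, so the posterior is undefined")
    return likelihood * prior / evidence


# Hypothetical numbers: a rare condition and an imperfect detector.
p = posterior(prior=0.01, likelihood=0.9, false_alarm=0.05)
print(f"P(A | B) = {p:.4f}")
```

With these made-up inputs the posterior stays well below the detector's hit rate, because the low prior dominates.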


Temporal Topic Analysis with Endogenous and Exogenous Processes

AAAI Conferences

We consider the problem of modeling temporal textual data taking endogenous and exogenous processes into account. Such text documents arise in real-world applications, including job advertisements and economic news articles, which are influenced by the fluctuations of the general economy. We propose a hierarchical Bayesian topic model which imposes a "group-correlated" hierarchical structure on the evolution of topics over time incorporating both processes, and show that this model can be estimated using Markov chain Monte Carlo sampling methods. We further demonstrate that this model captures the intrinsic relationships between the topic distribution and the time-dependent factors, and compare its performance with latent Dirichlet allocation (LDA) and two other related models. The model is applied to two collections of documents to illustrate its empirical performance: online job advertisements from DirectEmployers Association and journalists' postings on BusinessInsider.com.


Knowledge Compilation for Lifted Probabilistic Inference: Compiling to a Low-Level Language

AAAI Conferences

Algorithms based on first-order knowledge compilation are currently the state of the art for lifted inference. These algorithms typically compile a probabilistic relational model into an intermediate data structure and use it to answer many inference queries. In this paper, we propose compiling a probabilistic relational model directly into a low-level target (e.g., C or C++) program instead of an intermediate data structure and taking advantage of advances in program compilation. Our experiments show orders-of-magnitude speedups compared to existing approaches.


Negation Without Negation in Probabilistic Logic Programming

AAAI Conferences

Probabilistic logic programs without negation can have cycles (with a preference for false), but cannot represent all conditional distributions. Probabilistic logic programs with negation can represent arbitrary conditional probabilities, but with cycles they create logical inconsistencies. We show how allowing negative noise probabilities allows us to represent arbitrary conditional probabilities without negations. Noise probabilities for non-exclusive rules are difficult to interpret and unintuitive to manipulate; to alleviate this we define "probability-strengths" which provide an intuitive additive algebra for combining rules. For acyclic programs we prove what constraints on the strengths allow for proper distributions on the non-noise variables and allow for all non-extreme distributions to be represented. We show how arbitrary CPDs can be converted into this form in a canonical way. Furthermore, if a joint distribution can be compactly represented by a cyclic program with negations, we show how it can also be compactly represented with negative noise probabilities and no negations. This allows algorithms for exact inference that do not support negations to be applicable to probabilistic logic programs with negations.


Probabilistic Models over Weighted Orderings: Fixed-Parameter Tractable Variable Elimination

AAAI Conferences

Probabilistic models with weighted formulas, known as Markov models or log-linear models, are used in many domains. Recent models of weighted orderings between elements, which have been proposed as flexible tools to express preferences under uncertainty, are also potentially useful in applications like planning, temporal reasoning, and user modeling. Their computational properties are very different from those of conventional Markov models; because of the transitivity of the “less than” relation, standard methods that exploit structure of the models, such as variable elimination, are not directly applicable, as there are no conditional independencies between the orderings within connected components. The best known algorithms for general inference in these models are exponential in the number of statements. Here, we present the first algorithms that exploit the available structure. We begin with the special case of models in the form of chains; we present an exact O(n^3) algorithm, where n is the total number of elements. Next, we generalize this technique to models in which the set of statements is composed of arbitrary sets of atomic weighted preference formulas (while the query and evidence are conjunctions of atomic preference formulas), and the resulting exact algorithm runs in time O(m * n^2 * n^c), where m is the number of preference formulas, n is the number of elements, and c is the maximum number of elements in a linear cut (which depends both on the structure of the model and the order in which the elements are processed); therefore, this algorithm is tractable for cases in which c can be bounded to a low value. Finally, we report on the results of an empirical evaluation of both algorithms, showing how they scale with reasonably-sized models.


Bayesian Deduction with Subjective Opinions

AAAI Conferences

A Bayesian network (BN) is a compact representation of a joint probability distribution in the form of a directed acyclic graph (DAG) with random variables as nodes, and a set of conditional probability distributions associated with each node representing the probabilistic connection of the node ... Subjective opinions can represent uncertain probabilistic information of any kind, minor or major imprecision and even total ignorance about the probability distribution, by varying the uncertainty mass between 0 and 1. By simply substituting every input conditional probability distribution in a BN with a subjective opinion, we obtain what we call a subjective Bayesian network.
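The ordinary Bayesian network representation mentioned in this abstract, a DAG whose joint distribution is the product of each node's conditional distribution given its parents, can be sketched for a two-node network. Everything here is hypothetical: the network (Rain -> WetGrass), its probabilities, and the helper name `joint` are invented for illustration, and subjective opinions with an uncertainty mass are not modeled.

```python
# Hypothetical two-node DAG: Rain -> WetGrass.
# Prior over the root node.
p_rain = {True: 0.2, False: 0.8}

# Conditional distribution of WetGrass given its parent Rain,
# indexed as p_wet_given_rain[rain][wet].
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},
    False: {True: 0.1, False: 0.9},
}


def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet) as the product of the node-wise factors."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]


# The factors define a proper joint distribution: it sums to one.
total = sum(joint(r, w) for r in (True, False) for w in (True, False))

# Marginal of the child, obtained by summing out the parent.
p_wet = joint(True, True) + joint(False, True)
print(f"sum of joint = {total:.2f}, P(WetGrass) = {p_wet:.2f}")
```

A subjective Bayesian network, as the abstract describes it, would replace each of these conditional distributions with a subjective opinion; this sketch covers only the conventional case.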