 Directed Networks


Top Data Mining Algorithms Identified by IEEE & Related Python Resources

@machinelearnbot

The IEEE International Conference on Data Mining identified 10 algorithms in 2006 using surveys of past winners and voting. This is a list of those algorithms, each with a short description and related Python resources. The detailed paper is given here. C4.5 is an algorithm developed by Ross Quinlan that generates a decision tree. The decision trees generated by C4.5 can be used for classification, and for this reason C4.5 is often referred to as a statistical classifier.
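As a rough illustration of how a C4.5-style tree might pick a split, the sketch below computes the gain ratio of a categorical feature, the criterion C4.5 is generally described as using. It is not taken from the article; the function names and the toy weather data are hypothetical, and continuous features, missing values and pruning are left out.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

best_feature = max(
    {"outlook": outlook, "windy": windy}.items(),
    key=lambda item: gain_ratio(item[1], play),
)[0]
```

A tree builder would apply this choice recursively to each subset of the data produced by the selected feature.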


Exploiting Causality for Selective Belief Filtering in Dynamic Bayesian Networks

arXiv.org Artificial Intelligence

Dynamic Bayesian networks (DBNs) are a general model for stochastic processes with partially observed states. Belief filtering in DBNs is the task of inferring the belief state (i.e. the probability distribution over process states) based on incomplete and noisy observations. This can be a hard problem in complex processes with large state spaces. In this article, we explore the idea of accelerating the filtering task by automatically exploiting causality in the process. We consider a specific type of causal relation, called passivity, which pertains to how state variables cause changes in other variables. We present the Passivity-based Selective Belief Filtering (PSBF) method, which maintains a factored belief representation and exploits passivity to perform selective updates over the belief factors. PSBF produces exact belief states under certain assumptions and approximate belief states otherwise, where the approximation error is bounded by the degree of uncertainty in the process. We show empirically, in synthetic processes with varying sizes and degrees of passivity, that PSBF is faster than several alternative methods while achieving competitive accuracy. Furthermore, we demonstrate how passivity occurs naturally in a complex system such as a multi-robot warehouse, and how PSBF can exploit this to accelerate the filtering task.
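To make the filtering task concrete, here is a minimal sketch of one exact belief update for a small discrete process: predict with the transition model, weight by the observation likelihood, then renormalise. This is the plain, unfactored update and not the PSBF method itself; the function name and all numbers are hypothetical.

```python
def filter_step(belief, transition, likelihood):
    """One exact filtering step over a finite state space.

    belief[i]        : current probability of state i
    transition[i][j] : probability of moving from state i to state j
    likelihood[j]    : probability of the received observation in state j
    """
    n = len(belief)
    # Prediction: push the belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Correction: weight by the observation likelihood and renormalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    return [p / total for p in unnormalised]


# Hypothetical two-state process and a single noisy observation.
belief = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
likelihood = [0.7, 0.1]
new_belief = filter_step(belief, transition, likelihood)
```

The cost of this update grows with the square of the number of joint states, which is why factored representations and selective updates become attractive in large processes.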


Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation

Journal of Artificial Intelligence Research

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods. In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully-unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.


Naive Bayes for Dummies; A Simple Explanation

@machinelearnbot

This blog post was originally published as part of an ongoing series, "Popular Algorithms Explained in Simple English", on the AYLIEN Text Analysis Blog. Commonly used in Machine Learning, Naive Bayes is a collection of classification algorithms based on Bayes' Theorem. It is not a single algorithm but a family of algorithms that share a common principle: every feature being classified is assumed to be independent of the value of any other feature. So, for example, a fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A Naive Bayes classifier considers each of these "features" (red, round, 3" in diameter) to contribute independently to the probability that the fruit is an apple, regardless of any correlations between features.
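The fruit example can be sketched in a few lines. The priors and per-feature likelihoods below are invented for illustration and do not come from the post; the point is only that each feature's likelihood is multiplied in independently.

```python
import math


def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior over classes, treating every observed feature as independent."""
    scores = {}
    for cls, prior in priors.items():
        # Sum logs instead of multiplying raw probabilities to avoid underflow.
        log_score = math.log(prior)
        for feature in observed:
            log_score += math.log(likelihoods[cls][feature])
        scores[cls] = math.exp(log_score)
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical numbers: P(class) and P(feature | class).
priors = {"apple": 0.5, "orange": 0.5}
likelihoods = {
    "apple": {"red": 0.8, "round": 0.9, "3in": 0.7},
    "orange": {"red": 0.1, "round": 0.9, "3in": 0.6},
}
fruit_posterior = naive_bayes_posterior(priors, likelihoods, ["red", "round", "3in"])
```

With these made-up numbers the "red" feature does most of the work, since "round" is equally likely under both classes.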


Sparse group factor analysis for biclustering of multiple data sources

arXiv.org Machine Learning

Motivation: Modelling methods that find structure in data are necessary with the current large volumes of genomic data, and there have been various efforts to find subsets of genes exhibiting consistent patterns over subsets of treatments. These biclustering techniques have focused on one data source, often gene expression data. We present a Bayesian approach for joint biclustering of multiple data sources, extending a recent method, Group Factor Analysis (GFA), to have a biclustering interpretation with additional sparsity assumptions. The resulting method enables data-driven detection of linear structure present in parts of the data sources. Results: Our simulation studies show that the proposed method reliably infers biclusters from heterogeneous data sources. We tested the method on data from the NCI-DREAM drug sensitivity prediction challenge, resulting in excellent prediction accuracy. Moreover, the predictions are based on several biclusters which provide insight into the data sources, in this case gene expression, DNA methylation, protein abundance, exome sequence, functional connectivity fingerprints and drug sensitivity.


Bayes' Theorem And Robot Arms - Open Data Science Conferences

#artificialintelligence

If you enjoyed Jesse's presentation at ODSC's last Boston Big Data Conference, come to ODSC East this May to hear from his colleagues. Rather than start with the statement of Bayes' Theorem, I want to use an old math teacher trick (which I realize many students hate) of trying to derive it from scratch, without stating what we're trying to derive. We'll start by modifying a problem that I described in an earlier post on probability distributions. Bayes' Theorem gives you a way of determining the probability that a given event will occur, or that a given condition is true, given your knowledge of another related event or condition. All the examples that I've read or heard about seemed somewhat contrived and unrelated to the sorts of data analysis I was interested in.
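As a small numeric sketch of that idea, the function below applies Bayes' Theorem to a yes/no hypothesis and a single piece of evidence. The function name and the sensor numbers are hypothetical and are not taken from the post.

```python
def bayes_posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) from P(H), P(E | H) and P(E | not H)."""
    # Total probability of seeing the evidence at all.
    evidence = p_evidence_given_h * prior + p_evidence_given_not_h * (1 - prior)
    return p_evidence_given_h * prior / evidence


# Hypothetical case: a fault occurs 1% of the time, a sensor flags 90% of real
# faults, and it also raises a false alarm 5% of the time.
p_fault_given_alarm = bayes_posterior(0.01, 0.90, 0.05)
```

Even with a fairly reliable sensor, the low prior keeps the posterior small, which is the kind of result that often surprises people on first contact with the theorem.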


Temporal Topic Analysis with Endogenous and Exogenous Processes

AAAI Conferences

We consider the problem of modeling temporal textual data taking endogenous and exogenous processes into account. Such text documents arise in real world applications, including job advertisements and economic news articles, which are influenced by the fluctuations of the general economy. We propose a hierarchical Bayesian topic model which imposes a "group-correlated" hierarchical structure on the evolution of topics over time incorporating both processes, and show that this model can be estimated from Markov chain Monte Carlo sampling methods. We further demonstrate that this model captures the intrinsic relationships between the topic distribution and the time-dependent factors, and compare its performance with latent Dirichlet allocation (LDA) and two other related models. The model is applied to two collections of documents to illustrate its empirical performance: online job advertisements from DirectEmployers Association and journalists' postings on BusinessInsider.com.


Bayesian Deduction with Subjective Opinions

AAAI Conferences

Subjective opinions can represent uncertain probabilistic information of any kind, minor or major imprecision and even total ignorance about the probability distribution, by varying the uncertainty mass between 0 and 1. By simply substituting every input conditional probability distribution in a BN with a subjective opinion, we obtain what we call a subjective Bayesian network.

A Bayesian network (BN) is a compact representation of a joint probability distribution in the form of a directed acyclic graph (DAG) with random variables as nodes, and a set of conditional probability distributions associated with each node representing the probabilistic connection of the node ...
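One way to picture an opinion whose uncertainty mass varies between 0 and 1 is the binomial opinion of subjective logic: belief, disbelief and uncertainty masses that sum to 1, plus a base rate. The sketch below assumes that standard form and the usual projected probability (belief plus base rate times uncertainty); the excerpt itself does not spell these out, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinomialOpinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float  # prior probability used when evidence is lacking

    def __post_init__(self):
        # The three masses must form a valid partition of 1.
        if abs(self.belief + self.disbelief + self.uncertainty - 1.0) > 1e-9:
            raise ValueError("belief + disbelief + uncertainty must equal 1")

    def projected_probability(self):
        """Collapse the opinion to a single probability."""
        return self.belief + self.base_rate * self.uncertainty


# Uncertainty 0 behaves like an ordinary probability; uncertainty 1 is total
# ignorance and falls back to the base rate.
dogmatic = BinomialOpinion(belief=0.7, disbelief=0.3, uncertainty=0.0, base_rate=0.5)
vacuous = BinomialOpinion(belief=0.0, disbelief=0.0, uncertainty=1.0, base_rate=0.5)
```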


Solving PP^PP-Complete Problems Using Knowledge Compilation

AAAI Conferences

Knowledge compilation has been successfully used to solve beyond-NP problems, including some PP-complete and NP^PP-complete problems for Bayesian networks. In this work we show how knowledge compilation can be used to solve problems in the more intractable complexity class PP^PP. This class contains NP^PP and includes interesting AI problems, such as non-myopic value of information. We show how to solve the prototypical PP^PP-complete problem MajMajsat in linear time once the problem instance is compiled into a special class of Sentential Decision Diagrams. To show the practical value of our approach, we adapt it to answer the Same-Decision Probability (SDP) query, which was recently introduced for Bayesian networks. The SDP problem is also PP^PP-complete. It is a value-of-information query that quantifies the robustness of threshold-based decisions and comes with a corresponding algorithm that was also recently proposed. We present favorable experimental results, comparing our new algorithm based on knowledge compilation with the state-of-the-art algorithm for computing the SDP.
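For readers unfamiliar with MajMajsat, a brute-force sketch may help fix the question being asked: given a Boolean formula over two sets of variables X and Y, do a majority of X assignments leave a majority of Y assignments satisfying? The code below is an exponential-time enumeration and not the compiled, linear-time method of the paper; it assumes a strict majority (more than half) at both levels, and the function name and example formulas are hypothetical.

```python
from itertools import product


def maj_maj_sat(formula, num_x, num_y):
    """Brute-force MajMajsat check.

    formula(x, y) takes two tuples of booleans and returns a boolean.
    """
    good_x = 0
    for x in product([False, True], repeat=num_x):
        # Count the Y assignments that satisfy the formula for this X assignment.
        satisfied = sum(
            1 for y in product([False, True], repeat=num_y) if formula(x, y)
        )
        if satisfied * 2 > 2 ** num_y:  # strict majority over Y
            good_x += 1
    return good_x * 2 > 2 ** num_x  # strict majority over X


# Hypothetical formulas over one X variable and two Y variables.
easy = maj_maj_sat(lambda x, y: x[0] or y[0] or y[1], num_x=1, num_y=2)
hard = maj_maj_sat(lambda x, y: x[0] and y[0], num_x=1, num_y=2)
```

The nested counting is what places the problem beyond plain satisfiability: both the inner and the outer loop ask about a count of assignments, not merely the existence of one.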


On Partial Information and Contradictions in Probabilistic Abstract Argumentation

AAAI Conferences

We provide new insights into the area of combining abstract argumentation frameworks with probabilistic reasoning. In particular, we consider the scenario where assessments of the probabilities of a subset of the arguments are given and the probabilities of the remaining arguments have to be derived, taking both the topology of the argumentation framework and principles of probabilistic reasoning into account. We generalize this scenario by also considering inconsistent assessments, i.e., assessments that contradict the topology of the argumentation framework. Building on approaches to inconsistency measurement, we present a general framework to measure the amount of conflict in these assessments and provide a method for inconsistency-tolerant reasoning.