AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

Edge.org

#artificialintelligence | Apr-24-2016, 17:15:59 GMT

Perhaps the most important news of our day is that datasets--not algorithms--might be the key limiting factor to development of human-level artificial intelligence. At the dawn of the field of artificial intelligence, in 1967, two of its founders famously anticipated that solving the problem of computer vision would take only a summer. Now, almost a half century later, machine learning software finally appears poised to achieve human-level performance on vision tasks and a variety of other grand challenges. What took the AI revolution so long? A review of the timing of the most publicized AI advances over the past thirty years suggests a provocative explanation: perhaps many major AI breakthroughs have actually been constrained by the availability of high-quality training datasets, and not by algorithmic advances.

artificial intelligence, deep learning, machine learning, (11 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Chess (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

A Minimalistic Approach to Sum-Product Network Learning for Real Applications

Krakovna, Viktoriya, Looks, Moshe

arXiv.org Machine Learning | Apr-24-2016

Sum-Product Networks (SPNs) are a class of expressive yet tractable hierarchical graphical models. LearnSPN is a structure learning algorithm for SPNs that uses hierarchical co-clustering to simultaneously identify similar entities and similar features. The original LearnSPN algorithm assumes that all the variables are discrete and there is no missing data. We introduce a practical, simplified version of LearnSPN, MiniSPN, that runs faster and can handle missing data and heterogeneous features common in real applications. We demonstrate the performance of MiniSPN on standard benchmark datasets and on two datasets from Google's Knowledge Graph exhibiting high missingness rates and a mix of discrete and continuous features.
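To make "expressive yet tractable" concrete, here is a minimal sketch of evaluating a tiny hand-built SPN. It is not LearnSPN or MiniSPN and does no structure learning. The helper names (`leaf`, `product`, `weighted_sum`), the two variables A and B, and all weights are made up for illustration. Treating a missing value as a leaf that evaluates to 1 is the standard way to marginalize in an SPN; whether MiniSPN handles missing data this way is an assumption, not something the abstract states.

```python
def leaf(var, probs):
    # Univariate distribution over one variable.
    # A missing value (absent key or None) evaluates to 1, which marginalizes it out.
    return lambda x: 1.0 if x.get(var) is None else probs[x[var]]


def product(*children):
    # Product node: children cover disjoint sets of variables.
    def evaluate(x):
        result = 1.0
        for child in children:
            result *= child(x)
        return result
    return evaluate


def weighted_sum(weights, children):
    # Sum node: a mixture of children over the same variables; weights sum to 1.
    return lambda x: sum(w * c(x) for w, c in zip(weights, children))


# Hypothetical two-component SPN over binary variables A and B.
spn = weighted_sum(
    [0.6, 0.4],
    [
        product(leaf("A", {0: 0.9, 1: 0.1}), leaf("B", {0: 0.2, 1: 0.8})),
        product(leaf("A", {0: 0.3, 1: 0.7}), leaf("B", {0: 0.5, 1: 0.5})),
    ],
)

joint = spn({"A": 0, "B": 1})  # P(A=0, B=1)
marginal = spn({"A": 0})       # P(A=0), with B missing
```

With these made-up weights the joint comes out at about 0.492 and the marginal at about 0.66, each computed in a single bottom-up pass over the network.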

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1602.04259

Country:

North America > United States (0.14)
Europe > Spain (0.14)
Europe > Portugal (0.14)
Europe > France (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.51)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.35)

Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation

Başkaya, Osman, Jurgens, David

Journal of Artificial Intelligence Research | Apr-22-2016

Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods. In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully-unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.

computational linguistic, mapping function, proceedings, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4917

AI Access Foundation

10999

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Naive Bayes for Dummies; A Simple Explanation

@machinelearnbot | Apr-21-2016, 18:22:15 GMT

This blog post was originally published as part of an ongoing series, "Popular Algorithms Explained in Simple English", on the AYLIEN Text Analysis Blog. Commonly used in Machine Learning, Naive Bayes is a collection of classification algorithms based on Bayes' Theorem. It is not a single algorithm but a family of algorithms that all share a common principle: every feature being classified is independent of the value of any other feature. For example, a fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A Naive Bayes classifier considers each of these "features" (red, round, 3" in diameter) to contribute independently to the probability that the fruit is an apple, regardless of any correlations between features.
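The fruit example can be sketched as a small categorical Naive Bayes classifier. This is an illustrative sketch, not code from the blog post: the `train` and `predict` helpers, the toy dataset, and the Laplace smoothing parameter `alpha` are all assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels, alpha=1.0):
    """Count labels and per-feature values; alpha is Laplace smoothing."""
    class_counts = Counter(labels)
    # value_counts[label][feature_index][value] -> count
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab, alpha


def predict(model, row):
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        # log P(label) + sum_i log P(feature_i | label):
        # adding per-feature terms is the "naive" independence assumption.
        score = math.log(n / total)
        for i, value in enumerate(row):
            count = value_counts[label][i][value]
            score += math.log((count + alpha) / (n + alpha * len(vocab[i])))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up toy data: (colour, shape, diameter).
rows = [
    ("red", "round", "3in"),
    ("red", "round", "3in"),
    ("yellow", "long", "7in"),
    ("green", "round", "3in"),
    ("yellow", "long", "6in"),
]
labels = ["apple", "apple", "banana", "apple", "banana"]

model = train(rows, labels)
guess = predict(model, ("red", "round", "3in"))
```

Working in log space avoids numerical underflow when many features are multiplied together, and the smoothing term keeps an unseen feature value from forcing a class probability to zero.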

artificial intelligence, machine learning, naive Bayes, (9 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Robust Estimators in High Dimensions without the Computational Intractability

Diakonikolas, Ilias, Kamath, Gautam, Kane, Daniel, Li, Jerry, Moitra, Ankur, Stewart, Alistair

arXiv.org Machine Learning | Apr-21-2016

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples. Such questions have a rich history spanning statistics, machine learning and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension-dependent factors in their error guarantees. This raises the following question: Is high-dimensional agnostic distribution learning even possible, algorithmically? In this work, we obtain the first computationally efficient algorithms with dimension-independent error guarantees for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of spherical Gaussians. Our algorithms achieve error that is independent of the dimension, and in many cases scales nearly-linearly with the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high-dimensions, that may be applicable to many other problems.

algorithm, gaussian, product distribution, (15 more...)

arXiv.org Machine Learning

1604.06443

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

Markov models for ocular fixation locations in the presence and absence of colour

Kashlak, Adam B., Devane, Eoin, Dietert, Helge, Jackson, Henry

arXiv.org Machine Learning | Apr-21-2016

We propose to model the fixation locations of the human eye when observing a still image by a Markovian point process in $\mathbb{R}^2$. Our approach is data driven, using k-means clustering of the fixation locations to identify distinct salient regions of the image, which in turn correspond to the states of our Markov chain. Bayes factors are computed as a model selection criterion to determine the number of clusters. Furthermore, we demonstrate that the behaviour of the human eye differs from this model when colour information is removed from the given image.
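Once each fixation has been assigned to a cluster, the Markov chain over salient regions can be estimated by counting transitions between consecutive fixations. The sketch below shows only that counting step. It is not the authors' method: the clustering and the Bayes factor model selection are omitted, and the `transition_matrix` helper and the label sequence are hypothetical.

```python
def transition_matrix(states, n_states):
    """Maximum-likelihood transition probabilities from a sequence of state labels."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Count each consecutive pair (current state, next state).
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # No observed departures from this state: fall back to uniform
            # (an arbitrary choice made for this sketch).
            matrix.append([1.0 / n_states] * n_states)
    return matrix


# Hypothetical cluster labels for nine consecutive fixations.
sequence = [0, 0, 1, 2, 1, 0, 2, 2, 1]
P = transition_matrix(sequence, 3)
```

Each row of `P` sums to 1 and gives the estimated probability of moving from one region to each of the others on the next fixation.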

artificial intelligence, fixation, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1111/rssc.12223

1604.06335

Genre: Research Report > Experimental Study (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)

Sparse group factor analysis for biclustering of multiple data sources

Bunte, Kerstin, Leppäaho, Eemeli, Saarinen, Inka, Kaski, Samuel

arXiv.org Machine Learning | Apr-21-2016

Motivation: Modelling methods that find structure in data are necessary with the current large volumes of genomic data, and there have been various efforts to find subsets of genes exhibiting consistent patterns over subsets of treatments. These biclustering techniques have focused on one data source, often gene expression data. We present a Bayesian approach for joint biclustering of multiple data sources, extending a recent method, Group Factor Analysis (GFA), to have a biclustering interpretation with additional sparsity assumptions. The resulting method enables data-driven detection of linear structure present in parts of the data sources. Results: Our simulation studies show that the proposed method reliably infers biclusters from heterogeneous data sources. We tested the method on data from the NCI-DREAM drug sensitivity prediction challenge, resulting in excellent prediction accuracy. Moreover, the predictions are based on several biclusters which provide insight into the data sources, in this case on gene expression, DNA methylation, protein abundance, exome sequence, functional connectivity fingerprints and drug sensitivity.

artificial intelligence, bioinformatics, machine learning, (21 more...)

arXiv.org Machine Learning

doi: 10.1093/bioinformatics/btw207

1512.08808

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.96)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Bayes' Theorem And Robot Arms | Open Data Science Conferences

#artificialintelligence | Apr-20-2016, 21:51:34 GMT

If you enjoyed Jesse's presentation at ODSC's last Boston Big Data Conference, come to ODSC East this May to hear from his colleagues. Rather than start with the statement of Bayes' Theorem, I want to use an old math teacher trick (which I realize many students hate) of trying to derive it from scratch, without stating what we're trying to derive. Rather, we'll start by modifying a problem that I described in an earlier post on probability distributions. Bayes' Theorem gives you a way of determining the probability that a given event will occur, or that a given condition is true, given your knowledge of another related event or condition. All the examples that I've read or heard about seemed somewhat contrived and unrelated to the sorts of data analysis I was interested in.
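As a small worked illustration of that statement, Bayes' Theorem can be written as P(H | E) = P(E | H) P(H) / P(E), where P(E) is expanded over the two cases "H true" and "H false". The sketch below is not taken from the post: the `posterior` helper and the sensor numbers are hypothetical.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) given P(H), P(E | H), and P(E | not H)."""
    # Total probability of the evidence, over H true and H false.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Made-up numbers: a fault occurs 1% of the time, a sensor flags 90% of
# real faults, and it raises a false alarm 5% of the time.
p_fault_given_alarm = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```

With these numbers the posterior is roughly 0.154: even after an alarm, a fault is still unlikely, because the low prior outweighs the sensor's accuracy.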

artificial intelligence, bayesian inference, machine learning, (11 more...)

#artificialintelligence

Industry: Education > Curriculum > Subject-Specific Education (0.55)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.73)

Deep Learning in Neural Networks: An Overview

#artificialintelligence | Apr-20-2016, 19:55:45 GMT

What a wonderful treasure trove this paper is! Schmidhuber provides all the background you need to gain an overview of deep learning (as of 2014) and how we got there through the preceding decades. Starting from recent DL results, I tried to trace back the origins of relevant ideas through the past half century and beyond. The main part of the paper runs to 35 pages, and then there are 53 pages of references. Now, I know that many of you think I read a lot of papers – just over 200 a year on this blog – but if I did nothing but review these key works in the development of deep learning it would take me about 4.5 years to get through them at that rate! And when I'd finished I'd still be about 6 years behind the then current state of the art!

artificial intelligence, deep learning, machine learning, (15 more...)

#artificialintelligence

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

Elements of machine learning

@machinelearnbot | Apr-19-2016, 23:05:10 GMT

The official title of this free book, available in PDF format, is Machine Learning Cheat Sheet. See the table of contents screenshot below. Chapters 17 to 28 (the most interesting ones in my opinion) seem like a work in progress - I'm sure the authors intend to make them a bit bigger. For a more modern and applied book, get Dr Granville's book on data science.

artificial intelligence, inference, machine learning, (4 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)
