AITopics | Learning Graphical Models


Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)


End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

Ma, Xuezhe, Hovy, Eduard

arXiv.org Machine Learning | May-28-2016

State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of handcrafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data pre-processing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two data sets for two sequence labeling tasks -- Penn Treebank WSJ corpus for part-of-speech (POS) tagging and CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both datasets -- 97.55% accuracy for POS tagging and 91.21% F1 for NER.

1 Introduction

Linguistic sequence labeling, such as part-of-speech (POS) tagging and named entity recognition (NER), is one of the first stages in deep language understanding and its importance has been well recognized in the natural language processing community. Most traditional high performance sequence labeling models are linear statistical models, including Hidden Markov Models (HMM) and Conditional Random Fields (CRF) (Ratinov and Roth, 2009; Passos et al., 2014; Luo et al., 2015), which rely heavily on handcrafted features and task-specific resources. For example, English POS taggers benefit from carefully designed word spelling features; orthographic features and external resources such as gazetteers are widely used in NER. However, such task-specific knowledge is costly to develop (Ma and Xia, 2014), making sequence labeling models difficult to adapt to new tasks or new domains. In the past few years, nonlinear neural networks that take distributed word representations, also known as word embeddings, as input have been broadly applied to NLP problems with great success.
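The CRF component named above scores whole tag sequences rather than individual tags. As a rough illustration, the following is a minimal sketch of Viterbi decoding for a linear-chain model. It assumes per-token emission scores (which in an architecture like the one described would come from the neural layers) and a tag-to-tag transition matrix. The function name, argument layout, and toy scores are hypothetical and not taken from the paper.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for a linear-chain model.

    emissions[t][k]   : score of tag k at position t (hypothetical input)
    transitions[j][k] : score of moving from tag j to tag k (hypothetical input)
    """
    n_tags = len(transitions)
    # best[k] = best score of any tag path that ends in tag k at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for k in range(n_tags):
            # pick the previous tag j that maximises the score of arriving at tag k
            j = max(range(n_tags), key=lambda j: best[j] + transitions[j][k])
            pointers.append(j)
            new_best.append(best[j] + transitions[j][k] + emissions[t][k])
        best = new_best
        backpointers.append(pointers)
    # follow the back-pointers from the best final tag to recover the path
    last = max(range(n_tags), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Toy usage: two tags, three tokens, a small penalty for switching tags.
toy_emissions = [[2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
toy_transitions = [[0.0, -1.0], [-1.0, 0.0]]
toy_path, toy_score = viterbi_decode(toy_emissions, toy_transitions)
```

Decoding jointly lets the transition scores override a locally attractive tag, which is the property that distinguishes a CRF output layer from independent per-token classification.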

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

1603.01354

Country:

Europe (1.00)
Asia (0.93)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)


Spatial Semantic Scan: Jointly Detecting Subtle Events and their Spatial Footprint

Maurya, Abhinav

arXiv.org Machine Learning | May-28-2016

Many methods have been proposed for detecting emerging events in text streams using topic modeling. However, these methods have shortcomings that make them unsuitable for rapid detection of locally emerging events on massive text streams. We describe Spatially Compact Semantic Scan (SCSS), which has been developed specifically to overcome the shortcomings of current methods in detecting new spatially compact events in text streams. SCSS alternates between using semantic scan (Liu and Neill (2011)) to estimate contrastive foreground topics in documents, and discovering spatial neighborhoods (Shao et al. (2011)) with high occurrence of documents containing the foreground topics. We evaluate our method on an Emergency Department chief complaints dataset (ED dataset) to verify its effectiveness in detecting real-world disease outbreaks from free-text ED chief complaint data.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1511.00352

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Health Care Providers & Services (0.48)
Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science (0.93)
(3 more...)


Variational Tempering

Mandt, Stephan, McInerney, James, Abrol, Farhan, Ranganath, Rajesh, Blei, David

arXiv.org Machine Learning | May-28-2016

Variational inference (VI) combined with data subsampling enables approximate posterior inference over large data sets, but suffers from poor local optima. We first formulate a deterministic annealing approach for the generic class of conditionally conjugate exponential family models. This approach uses a decreasing temperature parameter which deterministically deforms the objective during the course of the optimization. A well-known drawback to this annealing approach is the choice of the cooling schedule. We therefore introduce variational tempering, a variational algorithm that introduces a temperature latent variable to the model. In contrast to related work in the Markov chain Monte Carlo literature, this algorithm results in adaptive annealing schedules. Lastly, we develop local variational tempering, which assigns a latent temperature to each data point; this allows for dynamic annealing that varies across data. Compared to traditional VI, all proposed approaches find improved predictive likelihoods on held-out data.
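To make the idea of a decreasing temperature concrete, here is a toy sketch of deterministic-annealing EM on a one-dimensional mixture. This is a simpler relative of the annealed variational updates the abstract describes, not the paper's algorithm. It assumes equal mixture weights, a fixed shared variance, and a hand-picked cooling schedule `temps` that ends at 1.0. All names and values are illustrative assumptions.

```python
import math


def annealed_em(xs, init_means, temps, sigma=1.0):
    """Deterministic-annealing EM for a 1-D mixture (equal weights, fixed variance).

    temps is a hypothetical cooling schedule; a temperature T > 1 flattens the
    soft assignments, and T = 1 gives the ordinary EM update.
    """
    means = list(init_means)
    for T in temps:
        # E-step: tempered responsibilities, r_nk proportional to exp(log N(x_n | mu_k) / T)
        resp = []
        for x in xs:
            logits = [-((x - m) ** 2) / (2 * sigma ** 2 * T) for m in means]
            top = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(l - top) for l in logits]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # M-step: responsibility-weighted mean for each component
        for k in range(len(means)):
            mass = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / mass
    return means


# Toy usage: two well-separated clusters, poorly separated initial means.
data = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
schedule = [8.0, 4.0, 2.0] + [1.0] * 20
fitted = annealed_em(data, [-0.1, 0.1], schedule)
```

The schedule here is fixed by hand, which is exactly the drawback the abstract points out. Variational tempering instead treats the temperature as a latent variable so that the schedule adapts.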

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1411.181

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)


Naïve-Bayes Technique for Machine Learning

#artificialintelligence | May-27-2016, 23:46:38 GMT

"We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances." "When you have two competing theories that make exactly the same predictions, the simpler one is the better." One famous example of Occam's Razor in action is found in conspiracy theories surrounding the NASA moon landings. Many conspiracy theorists believe that the first Moon Landing was staged and filmed in a studio, part of an elaborate hoax. Their justification relies upon many twisted and convoluted theories, whereas the NASA argument is fairly straightforward.

artificial intelligence, assumption, machine learning, (10 more...)

#artificialintelligence

Genre: Research Report (0.31)

Industry: Government > Space Agency (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)


Let Me Hear Your Voice and I'll Tell You How You Feel

#artificialintelligence | May-27-2016, 08:25:37 GMT

Creating mood sensing technology has become very popular in recent years. There is a wide range of companies trying to detect your emotions from what you write, the tone of your voice, or from the expressions on your face. All of these companies offer their technology online through cloud-based programming interfaces (APIs). As part of my offline emotion sensing hardware (Project Jammin), I have already built early prototypes of facial expression and speech content recognition for emotion detection. In this short article I describe the missing part, a voice tone analyzer.

artificial intelligence, emotion, machine learning, (9 more...)

#artificialintelligence

Country: Asia > Taiwan (0.06)

Industry: Information Technology (0.38)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)


Mastering Machine Learning With scikit-learn

#artificialintelligence | May-27-2016, 05:00:32 GMT

If you are a software developer who wants to learn how machine learning models work and how to apply them effectively, this book is for you. Familiarity with machine learning fundamentals and Python will be helpful, but is not essential. This book examines machine learning models including logistic regression, decision trees, and support vector machines, and applies them to common problems such as categorizing documents and classifying images. It begins with the fundamentals of machine learning, introducing you to the supervised-unsupervised spectrum, the uses of training and test data, and evaluating models. You will learn how to use generalized linear models in regression problems, as well as solve problems with text and categorical features. You will be acquainted with the use of logistic regression, regularization, and the various loss functions that are used by generalized linear models.

artificial intelligence, generalized linear model, mastering machine learning, (1 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.42)


Variational Bayesian Inference for Hidden Markov Models With Multivariate Gaussian Output Distributions

Gruhl, Christian, Sick, Bernhard

arXiv.org Machine Learning | May-27-2016

Hidden Markov Models (HMM) are a standard technique in time series analysis or data mining. Given a (set of) time series sample data, they are typically trained by means of a special variant of an expectation maximization (EM) algorithm, the Baum-Welch algorithm. HMM are used for gesture recognition, machine tool monitoring, or speech recognition, for instance. Second-order techniques are used to find values for parameters of probabilistic models from sample data. The parameters are regarded as random variables, and distributions are defined over these variables. The type of these second-order distributions depends on the type of the underlying probabilistic models. Typically, so-called conjugate distributions are chosen, e.g., a Gaussian-Wishart distribution for an underlying Gaussian for which mean and covariance matrix have to be determined. Second-order techniques have some advantages over conventional approaches, e.g.,
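Baum-Welch training, mentioned above, is built on the forward pass, which computes the likelihood of an observation sequence under fixed parameters. The following is a minimal sketch of a scaled forward pass, assuming one-dimensional Gaussian emissions for brevity (the abstract concerns multivariate Gaussians). The function names and parameter layout are illustrative assumptions, not the authors' implementation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def hmm_log_likelihood(obs, start, trans, means, variances):
    """Log-likelihood of obs under an HMM, via the scaled forward recursion.

    start[k]    : initial probability of state k
    trans[j][k] : probability of moving from state j to state k
    means[k], variances[k] : Gaussian emission parameters of state k
    """
    n = len(start)
    alpha = [start[k] * gaussian_pdf(obs[0], means[k], variances[k]) for k in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # propagate the normalised forward variables one step, then weight by emission
            alpha = [
                sum(alpha[j] * trans[j][k] for j in range(n))
                * gaussian_pdf(obs[t], means[k], variances[k])
                for k in range(n)
            ]
        # rescale at every step to avoid underflow; the scales multiply to the likelihood
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy usage: two states with different emission means.
toy_ll = hmm_log_likelihood(
    [0.1, -0.2, 3.1, 2.9],
    start=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    means=[0.0, 3.0],
    variances=[1.0, 1.0],
)
```

In a second-order (Bayesian) treatment such as the one the abstract describes, point estimates like `means` and `variances` would be replaced by distributions over those parameters.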

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

1605.08618

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)


Particle Metropolis-adjusted Langevin algorithms

Nemeth, Christopher, Sherlock, Chris, Fearnhead, Paul

arXiv.org Machine Learning | May-27-2016

Markov chain Monte Carlo algorithms are a popular and well-studied methodology that can be used to draw samples from posterior distributions. Over the past few years these algorithms have been extended to tackle problems where the model likelihood is intractable (Beaumont, 2003). Andrieu and Roberts (2009) showed that within the Metropolis-Hastings algorithm, if the likelihood is replaced with an unbiased estimate, then the sampler still targets the correct stationary distribution. Andrieu et al. (2010) extended this work further to create a class of Markov chain algorithms that use sequential Monte Carlo methods, also known as particle filters. Current implementations of pseudo-marginal and particle Markov chain Monte Carlo use random-walk proposals to update the parameters (e.g., Golightly and Wilkinson, 2011; Knape and de Valpine, 2012) and shall be referred to herein as particle random-walk Metropolis algorithms. Random walk-based algorithms propose a new value from some symmetric density centred on the current value.
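The random-walk proposal described above can be sketched in a few lines. This is plain random-walk Metropolis with an exactly computable log target, assuming a Gaussian proposal. The particle and pseudo-marginal variants discussed in the abstract would substitute an estimated likelihood, which is not shown here. The function name, `step`, and `seed` are illustrative choices.

```python
import math
import random


def random_walk_metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Draw n_steps samples from exp(log_target) using a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, logp = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        # symmetric proposal centred on the current value
        proposal = x + rng.gauss(0.0, step)
        logp_proposal = log_target(proposal)
        # accept with probability min(1, target(proposal) / target(x));
        # the proposal density cancels because it is symmetric
        if rng.random() < math.exp(min(0.0, logp_proposal - logp)):
            x, logp = proposal, logp_proposal
        samples.append(x)
    return samples


# Toy usage: sample from a standard normal, known only up to a constant.
draws = random_walk_metropolis(lambda x: -0.5 * x * x, x0=0.0, n_steps=20000, step=2.0, seed=1)
```

Because the proposal ignores the shape of the target, its scale has to be tuned by hand. Langevin-type proposals, the subject of the paper, instead use gradient information to steer proposals toward regions of higher probability.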

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1412.7299

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)


EEF: Exponentially Embedded Families with Class-Specific Features for Classification

Tang, Bo, Kay, Steven, He, Haibo, Baggenstoss, Paul M.

arXiv.org Machine Learning | May-27-2016

Classification is one of the fundamental problems in the fields of machine learning and signal processing. The commonly used classifier assigns a sample or a signal to the class with maximum posterior probability, which usually requires probability density function (PDF) estimation in either a model-driven or data-driven manner [1] [2] [3]. For high-dimensional data sets, it is necessary to perform feature reduction to estimate the PDFs robustly in a low-dimensional feature subspace. However, feature reduction may lose pertinent information for discrimination. For example, data samples from different classes that could be well separated in the raw data space may overlap in the feature subspace, causing classification errors. The PDF reconstruction approach addresses this information loss by reconstructing the PDF on raw data and classifying in the raw data space, which could improve classification performance. Several approaches have been developed along this track.
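The maximum-posterior rule described above can be illustrated with a model-driven toy case. The sketch below assumes one-dimensional Gaussian class-conditional densities and priors taken from class frequencies. It shows the baseline rule only, not the exponentially embedded families method of the paper, and all names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_classes(xs, ys):
    """Estimate (prior, mean, variance) per class from 1-D samples."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    model = {}
    for label, values in groups.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # floor the variance so a degenerate class does not divide by zero
        model[label] = (len(values) / len(xs), mean, max(var, 1e-9))
    return model


def map_classify(model, x):
    """Return the class with the largest (unnormalised) log posterior for x."""
    def log_posterior(label):
        prior, mean, var = model[label]
        log_pdf = -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return math.log(prior) + log_pdf
    return max(model, key=log_posterior)


# Toy usage: two classes centred near 0 and 5.
toy_model = fit_gaussian_classes(
    [0.0, 0.2, -0.2, 5.0, 5.2, 4.8],
    ["a", "a", "a", "b", "b", "b"],
)
toy_label = map_classify(toy_model, 0.1)
```

The quality of this rule depends entirely on how well the per-class PDFs are estimated, which is why the abstract turns to the difficulty of PDF estimation in high dimensions.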

artificial intelligence, classifier, machine learning, (12 more...)

arXiv.org Machine Learning

doi: 10.1109/LSP.2016.2574327

1605.03631

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)


Job opportunities (The University of Manchester)

@machinelearnbot | May-26-2016, 05:11:45 GMT

This is an exciting opportunity for a researcher at post-doctoral level with experience of machine learning and data mining. You will work with senior data scientists based within the local NHS trusts, the University of Manchester Health eResearch Centre, and Health Innovation Manchester to automate data extraction of predetermined features for all patients diagnosed with ovarian and colorectal cancer in the conurbation. Machine learning tools including neural networks, support vector machines and naïve Bayes algorithms will be refined and tested using the datasets accrued and optimised for clinical practice. Accuracy of prediction will be assessed using predefined criteria. Knowledge of cancer treatment would be useful but is not essential, as the team has extensive expertise in this area.

artificial intelligence, machine learning, manchester, (7 more...)

@machinelearnbot

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)
