AITopics | Markov Models

Collaborating Authors

Markov Models

News Overviews Instructional Materials AI-Alerts Classics

Some Applications of Markov Chain in Python

@machinelearnbotJan-19-2018, 02:47:54 GMT

In this article a few simple applications of Markov chain are going to be discussed as a solution to a few text processing problems. These problems appeared as assignments in a few courses, the descriptions are taken straightaway from the courses themselves. Use a Markov chain to create a statistical model of a piece of English text. Simulate the Markov chain to generate stylized pseudo-random text. In the 1948 landmark paper A Mathematical Theory of Communication, Claude Shannon founded the field of information theory and revolutionized the telecommunications industry, laying the groundwork for today's Information Age. In this paper, Shannon proposed using a Markov chain to create a statistical model of the sequences of letters in a piece of English text. Markov chains are now widely used in speech recognition, handwriting recognition, information retrieval, data compression, and spam filtering. They also have many scientific computing applications including the genemark algorithm for gene prediction, the Metropolis algorithm for measuring thermodynamical properties, and Google's PageRank algorithm for Web search.

machine learning, markov model, natural language, (18 more...)

@machinelearnbot

Country: North America > United States (0.14)

Industry: Telecommunications (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Automatic Classification of Music Genre using Masked Conditional Neural Networks

Medhat, Fady, Chesmore, David, Robinson, John

arXiv.org Machine LearningJan-16-2018

Neural network based architectures used for sound recognition are usually adapted from other application domains such as image recognition, which may not harness the time-frequency representation of a signal. The ConditionaL Neural Networks (CLNN) and its extension the Masked ConditionaL Neural Networks (MCLNN) are designed for multidimensional temporal signal recognition. The CLNN is trained over a window of frames to preserve the inter-frame relation, and the MCLNN enforces a systematic sparseness over the network's links that mimics a filterbank-like behavior. The masking operation induces the network to learn in frequency bands, which decreases the network susceptibility to frequency-shifts in time-frequency representations. Additionally, the mask allows an exploration of a range of feature combinations concurrently analogous to the manual handcrafting of the optimum collection of features for a recognition task. MCLNN have achieved competitive performance on the Ballroom music dataset compared to several hand-crafted attempts and outperformed models based on state-of-the-art Convolutional Neural Networks.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/ICDM.2017.125

1801.05504

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Conditional Random Fields (CRF): Short Survey

@machinelearnbotJan-15-2018, 15:03:16 GMT

Currently, many of us are overwhelmed with mighty power of Deep Learning. We start to forget about humble graphical models. CRF is not so trendy as LSTM, but it is robust, reliable and worth noting. In this post, you will find a short summary about CRF (aka Conditional Random Fields) – what is this thing, what is it for and some interesting facts. In practical implementation, the computational time is often larger due to many other operations like numerical scaling, smoothing etc.

artificial intelligence, crf, machine learning, (14 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

Add feedback

Generative Adversarial Networks (GANs): Engine and Applications

@machinelearnbotJan-11-2018, 21:16:35 GMT

GANs were introduced by Ian Goodfellow in 2014. Both of them are dedicated to extract features from data by learning the identity function f(x) x and both of them rely on Markov chains to train or to generate samples. Generative adversarial networks were designed to avoid using Markov chains because of the high computational cost of the latter. Another advantage relative to Boltzmann machines is that the Generator function has much fewer restrictions (there are only a few probability distributions that admit Markov chain sampling). In this article, we'll tell you how generative adversarial nets work and what their most popular applications in real life are.

artificial intelligence, discriminator, machine learning, (14 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Noisy Expectation-Maximization: Applications and Generalizations

Osoba, Osonde, Kosko, Bart

arXiv.org Machine LearningJan-11-2018

We present a noise-injected version of the Expectation-Maximization (EM) algorithm: the Noisy Expectation Maximization (NEM) algorithm. The NEM algorithm uses noise to speed up the convergence of the EM algorithm. The NEM theorem shows that injected noise speeds up the average convergence of the EM algorithm to a local maximum of the likelihood surface if a positivity condition holds. The generalized form of the noisy expectation-maximization (NEM) algorithm allow for arbitrary modes of noise injection including adding and multiplying noise to the data. We demonstrate these noise benefits on EM algorithms for the Gaussian mixture model (GMM) with both additive and multiplicative NEM noise injection. A separate theorem (not presented here) shows that the noise benefit for independent identically distributed additive noise decreases with sample size in mixture models. This theorem implies that the noise benefit is most pronounced if the data is sparse. Injecting blind noise only slowed convergence.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1801.04053

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre: Research Report > Experimental Study > Negative Result (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

The History Of Speech Recognition And A Glimpse Into Its Future

#artificialintelligenceJan-10-2018, 06:21:27 GMT

With the release of Apple's Siri and comparable voice search assistance from Microsoft and Google, you might have speculated why it took so long for speech recognition innovation to progress to this stage. In addition, one may also wonder what the future holds for natural language-based machine intelligence learning and its impact on our everyday lives. A closer look at the history and development of voice recognition technology may be somewhat akin to watching a toddler grow up, advancing from the baby-talk level and developing terminologies of countless words to responding to queries with fast, amusing repartees, just like what the clever digital assistant Siri does. Here is a close depiction at the innovations of the past generations with regards to speech recognition and what the future has in store for this technology. The "Audrey" system is the earliest speech recognition device that could recognize only digits.

artificial intelligence, machine learning, speech recognition, (6 more...)

#artificialintelligence

Country: North America > United States (0.16)

Industry:

Government > Military (0.33)
Government > Regional Government (0.32)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

Add feedback

Distributed Constraint Optimization Problems and Applications: A Survey

Fioretto, Ferdinando, Pontelli, Enrico, Yeoh, William

arXiv.org Artificial IntelligenceJan-10-2018

The field of Multi-Agent System (MAS) is an active area of research within Artificial Intelligence, with an increasingly important impact in industrial and other real-world applications. Within a MAS, autonomous agents interact to pursue personal interests and/or to achieve common objectives. Distributed Constraint Optimization Problems (DCOPs) have emerged as one of the prominent agent architectures to govern the agents' autonomous behavior, where both algorithms and communication models are driven by the structure of the specific problem. During the last decade, several extensions to the DCOP model have enabled them to support MAS in complex, real-time, and uncertain environments. This survey aims at providing an overview of the DCOP model, giving a classification of its multiple extensions and addressing both resolution methods and applications that find a natural mapping within each class of DCOPs. The proposed classification suggests several future perspectives for DCOP extensions, and identifies challenges in the design of efficient resolution algorithms, possibly through the adaptation of strategies from different areas.

agent, constraint-based reasoning, renewable energy, (21 more...)

arXiv.org Artificial Intelligence

1602.06347

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
Europe (0.14)
South America > Brazil (0.14)
(3 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy > Power Industry (1.00)
Information Technology > Security & Privacy (0.92)
Law (0.69)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Add feedback

Stochastic Gradient Monomial Gamma Sampler

Zhang, Yizhe, Chen, Changyou, Gan, Zhe, Henao, Ricardo, Carin, Lawrence

arXiv.org Machine LearningJan-10-2018

Recent advances in stochastic gradient techniques have made it possible to estimate posterior distributions from large datasets via Markov Chain Monte Carlo (MCMC). However, when the target posterior is multimodal, mixing performance is often poor. This results in inadequate exploration of the posterior distribution. A framework is proposed to improve the sampling efficiency of stochastic gradient MCMC, based on Hamiltonian Monte Carlo. A generalized kinetic function is leveraged, delivering superior stationary mixing, especially for multimodal distributions. Techniques are also discussed to overcome the practical issues introduced by this generalization. It is shown that the proposed approach is better at exploring complex multimodal posterior distributions, as demonstrated on multiple applications and in comparison with other stochastic gradient MCMC methods.

artificial intelligence, machine learning, sgmgt-d, (15 more...)

arXiv.org Machine Learning

1706.01498

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Weighted Contrastive Divergence

Merino, Enrique Romero, Castrillejo, Ferran Mazzanti, Pin, Jordi Delgado, Prats, David Buchaca

arXiv.org Machine LearningJan-8-2018

Learning algorithms for energy based Boltzmann architectures that rely on gradient descent are in general computationally prohibitive, typically due to the exponential number of terms involved in computing the partition function. In this way one has to resort to approximation schemes for the evaluation of the gradient. This is the case of Restricted Boltzmann Machines (RBM) and its learning algorithm Contrastive Divergence (CD). It is well-known that CD has a number of shortcomings, and its approximation to the gradient has several drawbacks. Overcoming these defects has been the basis of much research and new algorithms have been devised, such as persistent CD. In this manuscript we propose a new algorithm that we call Weighted CD (WCD), built from small modifications of the negative phase in standard CD. However small these modifications may be, experimental work reported in this paper suggest that WCD provides a significant improvement over standard CD and persistent CD at a small additional computational cost.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

1801.02567

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Log-concave sampling: Metropolis-Hastings algorithms are fast!

Dwivedi, Raaz, Chen, Yuansi, Wainwright, Martin J., Yu, Bin

arXiv.org Machine LearningJan-8-2018

We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^d$, and prove a non-asymptotic upper bound on the mixing time of the Metropolis-adjusted Langevin algorithm (MALA). The method draws samples by running a Markov chain obtained from the discretization of an appropriate Langevin diffusion, combined with an accept-reject step to ensure the correct stationary distribution. Relative to known guarantees for the unadjusted Langevin algorithm (ULA), our bounds show that the use of an accept-reject step in MALA leads to an exponentially improved dependence on the error-tolerance. Concretely, in order to obtain samples with TV error at most $\delta$ for a density with condition number $\kappa$, we show that MALA requires $\mathcal{O} \big(\kappa d \log(1/\delta) \big)$ steps, as compared to the $\mathcal{O} \big(\kappa^2 d/\delta^2 \big)$ steps established in past work on ULA. We also demonstrate the gains of MALA over ULA for weakly log-concave densities. Furthermore, we derive mixing time bounds for a zeroth-order method Metropolized random walk (MRW) and show that it mixes $\mathcal{O}(\kappa d)$ slower than MALA. We provide numerical examples that support our theoretical findings, and demonstrate the potential gains of Metropolis-Hastings adjustment for Langevin-type algorithms.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Machine Learning

1801.02309

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback