AITopics

Due to the simplicity and efficiency of its parameter estimation algorithm, the hidden Markov model (HMM) has emerged as one of the basic statistical tools for modeling discrete time series, finding widespread application in the areas of speech recognition (Rabiner and Juang, 1986) and computational molecular biology (Baldi et al., 1994). An HMM is essentially a mixture model, encoding information about the history of a time series in the value of a single multinomial variable (the hidden state). This multinomial assumption allows an efficient parameter estimation algorithm to be derived (the Baum-Welch algorithm). However, it also severely limits the representational capacity of HMMs.

algorithm, markov model, state representation, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.07)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

A Unified Learning Scheme: Bayesian-Kullback Ying-Yang Machine

Xu, Lei

A Bayesian-Kullback learning scheme, called Ying-Yang Machine, is proposed based on the two complement but equivalent Bayesian representations for joint density and their Kullback divergence. Not only the scheme unifies existing major supervised and unsupervised learnings, including the classical maximum likelihood or least square learning, the maximum information preservation, the EM & em algorithm and information geometry, the recent popular Helmholtz machine, as well as other learning methods with new variants and new results; but also the scheme provides a number of new learning models. 1 INTRODUCTION Many different learning models have been developed in the literature. We may come to an age of searching a unified scheme for them. With a unified scheme, we may understand deeply the existing models and their relationships, which may cause cross-fertilization on them to obtain new results and variants; We may also be guided to develop new learning models, after we get better understanding on which cases we have already studied or missed, which deserve to be further explored. Recently, a Baysian-Kullback scheme, called the YING-YANG Machine, has been proposed as such an effort(Xu, 1995a). It bases on the Kullback divergence and two complement but equivalent Baysian representations for the joint distribution of the input space and the representation space, instead of merely using Kullback divergence for matching un-structuralized joint densities in information geometry type learnings (Amari, 1995a&b; Byrne, 1992; Csiszar, 1975).

pm2, representation, ylx, (13 more...)

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > Taiwan (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Family Discovery

Omohundro, Stephen M.

"Family discovery" is the task of learning the dimension and structure of a parameterized family of stochastic models. It is especially appropriate when the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning and show that they significantly improve performance over two alternatives on a parameterized classification task.

algorithm, family discovery algorithm, parameterized family, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Konig, Yochai, Bourlard, Hervé, Morgan, Nelson

REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition

In this paper, we introduce REMAP, an approach for the training and estimation of posterior probabilities using a recursive algorithm that is reminiscent of the EMbased Forward-Backward (Liporace 1982) algorithm for the estimation of sequence likelihoods. Although very general, the method is developed in the context of a statistical model for transition-based speech recognition using Artificial Neural Networks (ANN) to generate probabilities for Hidden Markov Models (HMMs). In the new approach, we use local conditional posterior probabilities of transitions to estimate global posterior probabilities of word sequences. Although we still use ANNs to estimate posterior probabilities, the network is trained with targets that are themselves estimates of local posterior probabilities. An initial experimental result shows a significant decrease in error-rate in comparison to a baseline system. 1 INTRODUCTION The ultimate goal in speech recognition is to determine the sequence of words that has been uttered.

algorithm, posterior probability, probability, (11 more...)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)
North America > United States > Oregon (0.04)
Europe > Belgium (0.04)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Adaptive Mixture of Probabilistic Transducers

Singer, Yoram

We introduce and analyze a mixture model for supervised learning of probabilistic transducers. We devise an online learning algorithm that efficiently infers the structure and estimates the parameters of each model in the mixture. Theoretical analysis and comparative simulations indicate that the learning algorithm tracks the best model from an arbitrarily large (possibly infinite) pool of models. We also present an application of the model for inducing a noun phrase recognizer.

prediction, probability, suffix tree transducer, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.

Bayesian Methods for Mixtures of Experts

ABSTRACT We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented ego Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".

algorithm, bayesian method, hyperparameter, (12 more...)

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Snapp, Robert R., Xu, Tong

Estimating the Bayes Risk from Sample Data

In this setting, each pattern, represented as an n-dimensional feature vector, is associated with a discrete pattern class, or state of nature (Duda and Hart, 1973). Using available information, (e.g., a statistically representative set of labeled feature vectors

bayes risk, classification problem, classifier, (14 more...)

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Texas (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Learning the Structure of Similarity

Tenenbaum, Joshua B.

The additive clustering (ADCL US) model (Shepard & Arabie, 1979) treats the similarity of two stimuli as a weighted additive measure of their common features. Inspired by recent work in unsupervised learning with multiple cause models, we propose anew, statistically well-motivated algorithm for discovering the structure of natural stimulus classes using the ADCLUS model, which promises substantial gains in conceptual simplicity, practical efficiency, and solution quality over earlier efforts.

algorithm, cluster configuration, similarity, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)

Stable Fitted Reinforcement Learning

Gordon, Geoffrey J.

We describe the reinforcement learning problem, motivate algorithms which seek an approximation to the Q function, and present new convergence results for two such algorithms. 1 INTRODUCTION AND BACKGROUND Imagine an agent acting in some environment. At time t, the environment is in some state Xt chosen from a finite set of states. The agent perceives Xt, and is allowed to choose an action at from some finite set of actions. Meanwhile, the agent experiences a real-valued cost Ct, chosen from a distribution which also depends only on Xt and at and which has finite mean and variance. Such an environment is called a Markov decision process, or MDP.

algorithm, learning, q-iearning, (15 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Meila, Marina, Jordan, Michael I.

Learning Fine Motion by Markov Mixtures of Experts

Brain and Cognitive Sciences Massachussetts Inst. of Technology Massachussetts Inst. of Technology Cambridge, MA 02139 Cambridge, MA 02139 mmp@ai.mit.edu Abstract Compliant control is a standard method for performing fine manipulation tasks, like grasping and assembly, but it requires estimation of the state of contact (s.o.c.) between the robot arm and the objects involved. Here we present a method to learn a model of the movement from measured data. The method requires little or no prior knowledge and the resulting model explicitly estimates the s.o.c. The current s.o.c. is viewed as the hidden state variable of a discrete HMM.

algorithm, learning fine motion, markov mixture, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.75)
Asia > Middle East > Jordan (0.06)

Technology:

Information Technology > Artificial Intelligence > Robots (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)