AITopics

In this setting, each pattern, represented as an n-dimensional feature vector, is associated with a discrete pattern class, or state of nature (Duda and Hart, 1973). Using available information (e.g., a statistically representative set of labeled feature vectors

artificial intelligence, classifier, machine learning, (16 more...)

Country: North America > United States > Vermont > Chittenden County > Burlington (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Platt, John C., Allen, Timothy P.

A Neural Network Classifier for the I100 OCR Chip

Therefore, we want c to be less than 0.5. In order to get a 2:1 margin, we choose c = 0.25. The classifier is trained only on individual partial characters instead of all possible combinations of partial characters. Therefore, we can specify the classifier using only 1523 constraints, instead of creating a training set of approximately 128,000 possible combinations of partial characters. Applying these constraints is therefore much faster than back-propagation on the entire data set.

artificial intelligence, constraint, machine learning, (16 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.42)

Frey, Brendan J., Hinton, Geoffrey E., Dayan, Peter

Does the Wake-sleep Algorithm Produce Good Density Estimators?

The wake-sleep algorithm (Hinton, Dayan, Frey and Neal 1995) is a relatively efficient method of fitting a multilayer stochastic generative model to high-dimensional data. In addition to the top-down connections in the generative model, it makes use of bottom-up connections for approximating the probability distribution over the hidden units given the data, and it trains these bottom-up connections using a simple delta rule. We use a variety of synthetic and real data sets to compare the performance of the wake-sleep algorithm with Monte Carlo and mean field methods for fitting the same generative model and also compare it with other models that are less powerful but easier to fit.

1 INTRODUCTION

Neural networks are often used as bottom-up recognition devices that transform input vectors into representations of those vectors in one or more hidden layers. But multilayer networks of stochastic neurons can also be used as top-down generative models that produce patterns with complicated correlational structure in the bottom visible layer. In this paper we consider generative models composed of layers of stochastic binary logistic units. Given a generative model parameterized by top-down weights, there is an obvious way to perform unsupervised learning. The generative weights are adjusted to maximize the probability that the visible vectors generated by the model would match the observed data.
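The two phases described above can be sketched for the smallest possible case: one hidden layer of stochastic binary logistic units above one visible layer. This is a minimal illustration, not the authors' implementation; the class name `HelmholtzMachine`, the zero initialization, and the learning rate are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p, rng):
    # Draw binary units that are on with probability p.
    return (rng.random(p.shape) < p).astype(float)

class HelmholtzMachine:
    """One hidden layer, one visible layer (hypothetical minimal setup)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        # Top-down (generative) parameters.
        self.gen_hidden_bias = np.zeros(n_hidden)
        self.gen_weights = np.zeros((n_hidden, n_visible))
        self.gen_visible_bias = np.zeros(n_visible)
        # Bottom-up (recognition) parameters.
        self.rec_weights = np.zeros((n_visible, n_hidden))
        self.rec_hidden_bias = np.zeros(n_hidden)

    def wake_sleep_step(self, v, lr=0.05):
        # Wake phase: infer hidden states bottom-up from a data vector,
        # then apply the delta rule to the generative parameters.
        h = sample(sigmoid(v @ self.rec_weights + self.rec_hidden_bias), self.rng)
        p_h = sigmoid(self.gen_hidden_bias)
        p_v = sigmoid(h @ self.gen_weights + self.gen_visible_bias)
        self.gen_hidden_bias += lr * (h - p_h)
        self.gen_weights += lr * np.outer(h, v - p_v)
        self.gen_visible_bias += lr * (v - p_v)

        # Sleep phase: generate a "dream" top-down, then apply the
        # delta rule to the recognition parameters.
        h_dream = sample(sigmoid(self.gen_hidden_bias), self.rng)
        v_dream = sample(
            sigmoid(h_dream @ self.gen_weights + self.gen_visible_bias), self.rng
        )
        q_h = sigmoid(v_dream @ self.rec_weights + self.rec_hidden_bias)
        self.rec_weights += lr * np.outer(v_dream, h_dream - q_h)
        self.rec_hidden_bias += lr * (h_dream - q_h)

    def generate(self, n):
        # Ancestral sampling: hidden units first, then visible units.
        p_h = np.tile(sigmoid(self.gen_hidden_bias), (n, 1))
        h = sample(p_h, self.rng)
        return sample(sigmoid(h @ self.gen_weights + self.gen_visible_bias), self.rng)
```

Calling `wake_sleep_step` once per training vector and then `generate(n)` gives samples from the fitted model; deeper networks repeat the same local updates layer by layer.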

artificial intelligence, helmholtz machine, machine learning, (13 more...)

Country:

North America > United States > Massachusetts (0.28)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)

Ormoneit, Dirk, Tresp, Volker

Improved Gaussian Mixture Density Estimates Using Bayesian Penalty Terms and Network Averaging

We compare two regularization methods which can be used to improve the generalization capabilities of Gaussian mixture density estimates. The first method uses a Bayesian prior on the parameter space. We derive EM (Expectation Maximization) update rules which maximize the a posteriori parameter probability. In the second approach we apply ensemble averaging to density estimation. This includes Breiman's "bagging", which recently has been found to produce impressive results for classification networks.
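Both ideas can be illustrated on a one-dimensional Gaussian mixture. The sketch below is not the paper's derivation: it assumes a symmetric Dirichlet prior (parameter `alpha`) on the mixing weights, an inverse-gamma prior (parameters `a`, `b`) on each variance, and a flat prior on the means, and it averages the densities of models fitted to bootstrap resamples. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def fit_gmm_map(x, k, alpha=2.0, a=1.0, b=0.1, n_iter=50, seed=0):
    """EM for a 1-D Gaussian mixture with MAP (penalized) M-step updates."""
    rng = np.random.default_rng(seed)
    n = len(x)
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, np.var(x) + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        joint = weights * gaussian_pdf(x[:, None], means, variances)
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)
        nk = resp.sum(axis=0)
        # M-step: the prior terms act as penalties on the usual updates.
        weights = (nk + alpha - 1.0) / (n + k * (alpha - 1.0))
        means = (resp * x[:, None]).sum(axis=0) / (nk + 1e-12)
        sq_dev = (resp * (x[:, None] - means) ** 2).sum(axis=0)
        # Inverse-gamma prior keeps variances away from zero.
        variances = (sq_dev + 2.0 * b) / (nk + 2.0 * a + 2.0)
    return weights, means, variances

def bagged_density(x, k, n_models=10, seed=0):
    """Average the densities of mixtures fitted to bootstrap resamples."""
    rng = np.random.default_rng(seed)
    models = []
    for m in range(n_models):
        resample = rng.choice(x, size=len(x), replace=True)
        models.append(fit_gmm_map(resample, k, seed=seed + m))

    def density(query):
        q = np.atleast_1d(np.asarray(query, dtype=float))
        per_model = [
            (w * gaussian_pdf(q[:, None], mu, var)).sum(axis=1)
            for w, mu, var in models
        ]
        return np.mean(per_model, axis=0)

    return density
```

The variance update shows the regularizing effect directly: even a component that claims almost no data keeps a variance of at least `2b / (n + 2a + 2)`, so the likelihood cannot diverge by collapsing onto a single point.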

artificial intelligence, bayesian inference, machine learning, (10 more...)

Country:

Europe > Germany (0.15)
North America > Canada > Ontario > Toronto (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Hofmann, Reimar, Tresp, Volker

Discovering Structure in Continuous Variables Using Bayesian Networks

We study Bayesian networks for continuous variables using nonlinear conditional density estimators. We demonstrate that useful structures can be extracted from a data set in a self-organized way and we present sampling techniques for belief update based on Markov blanket conditional density models.

1 Introduction

One of the strongest types of information that can be learned about an unknown process is the discovery of dependencies and, even more important, of independencies. A prime example is medical epidemiology, where the goal is to find the causes of a disease and exclude factors which are irrelevant.
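For reference, the standard identity behind sampling from a Markov blanket is given below. The notation is an assumption, not the paper's: $\mathrm{pa}_i$ denotes the parents of variable $x_i$ in the network, and $x_{\setminus i}$ denotes all variables except $x_i$.

```latex
% Conditional of one variable given all others in a Bayesian network.
% Only x_i's own conditional and those of its children depend on x_i,
% so the result involves only the Markov blanket of x_i
% (parents, children, and the children's other parents).
p(x_i \mid x_{\setminus i})
  \;\propto\;
  p(x_i \mid \mathrm{pa}_i)
  \prod_{j \,:\, i \in \mathrm{pa}_j} p(x_j \mid \mathrm{pa}_j)
```

A Gibbs-style sampler can update one variable at a time from this conditional, which is why conditional density models over the Markov blanket are sufficient for belief update.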

artificial intelligence, bayesian network, machine learning, (18 more...)

Country: Europe > Germany (0.14)

Industry: Health & Medicine > Epidemiology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Unified Learning Scheme: Bayesian-Kullback Ying-Yang Machine

Xu, Lei

A Bayesian-Kullback learning scheme, called the Ying-Yang Machine, is proposed based on two complementary but equivalent Bayesian representations of the joint density and their Kullback divergence. Not only does the scheme unify the major existing supervised and unsupervised learning methods, including classical maximum likelihood or least-squares learning, maximum information preservation, the EM and em algorithms and information geometry, the recently popular Helmholtz machine, and other learning methods with new variants and new results; it also provides a number of new learning models.

1 INTRODUCTION

Many different learning models have been developed in the literature. We may be entering an age of searching for a unified scheme for them. With a unified scheme, we may understand the existing models and their relationships more deeply, which may lead to cross-fertilization among them and yield new results and variants. We may also be guided to develop new learning models once we better understand which cases have already been studied and which have been missed and deserve further exploration. Recently, a Bayesian-Kullback scheme, called the YING-YANG Machine, has been proposed as such an effort (Xu, 1995a). It is based on the Kullback divergence and two complementary but equivalent Bayesian representations for the joint distribution of the input space and the representation space, instead of merely using the Kullback divergence for matching unstructured joint densities in information-geometry-type learning (Amari, 1995a&b; Byrne, 1992; Csiszar, 1975).
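The two representations and their divergence can be written out as follows. This is a sketch in assumed notation (the abstract gives no formulas): $x$ is a point in the input space, $y$ a point in the representation space, and the subscripts $M_1$, $M_2$ label the two factorizations of the joint density.

```latex
% Two factorizations of the same joint density p(x, y):
p_{M_1}(x, y) = p_{M_1}(y \mid x)\, p_{M_1}(x),
\qquad
p_{M_2}(x, y) = p_{M_2}(x \mid y)\, p_{M_2}(y)

% Kullback divergence between them, to be minimized by learning:
KL(M_1, M_2)
  = \int p_{M_1}(y \mid x)\, p_{M_1}(x)\,
    \ln \frac{p_{M_1}(y \mid x)\, p_{M_1}(x)}
             {p_{M_2}(x \mid y)\, p_{M_2}(y)}
    \, dx\, dy
```

The divergence is zero exactly when the two factorizations describe the same joint density, which is the sense in which the representations are equivalent.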

artificial intelligence, machine learning, pm2, (16 more...)

Country: Asia > China (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Adaptive Mixture of Probabilistic Transducers

Singer, Yoram

We introduce and analyze a mixture model for supervised learning of probabilistic transducers. We devise an online learning algorithm that efficiently infers the structure and estimates the parameters of each model in the mixture. Theoretical analysis and comparative simulations indicate that the learning algorithm tracks the best model from an arbitrarily large (possibly infinite) pool of models. We also present an application of the model for inducing a noun phrase recognizer.
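The tracking claim rests on a general property of Bayesian mixtures that can be shown without any transducer machinery. The sketch below is not the paper's algorithm: it assumes a fixed, finite pool of models and takes as input the probability each model assigned to the output that was actually observed at each step. The function name and input layout are hypothetical.

```python
import numpy as np

def online_bayes_mixture(expert_probs, prior=None):
    """Online Bayesian mixture over a fixed pool of predictive models.

    expert_probs[t, i] is the probability model i assigned to the
    output observed at step t. Returns the final posterior weights
    and the cumulative log loss of the mixture prediction.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    n_models = expert_probs.shape[1]
    if prior is None:
        weights = np.full(n_models, 1.0 / n_models)
    else:
        weights = np.asarray(prior, dtype=float)
    log_loss = 0.0
    for p_t in expert_probs:
        # The mixture predicts with the weighted average of the models.
        mix_prob = float(weights @ p_t)
        log_loss -= np.log(mix_prob)
        # Bayes update: models that predicted well gain weight.
        weights = weights * p_t / mix_prob
    return weights, log_loss
```

With a uniform prior over M models, the cumulative log loss of the mixture exceeds that of the best single model by at most ln M, whatever the sequence; this is the sense in which such a mixture "tracks" the best model in the pool.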

machine learning, natural language, suffix tree transducer, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.

Bayesian Methods for Mixtures of Experts

ABSTRACT

We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction.

INTRODUCTION

The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g., Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".
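For readers unfamiliar with the model whose parameters are being inferred, the sketch below evaluates the predictive mean and variance of a simple mixture of experts. It covers only the forward model, not the variational inference, and it assumes linear experts with Gaussian noise and a softmax gate; the function and argument names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predict(x, gate_weights, expert_weights, noise_var):
    """Predictive mean and variance of a mixture of linear experts.

    x:              (n, d) inputs
    gate_weights:   (d, m) softmax gate parameters
    expert_weights: (d, m) one linear expert per column
    noise_var:      (m,)   output noise variance of each expert
    """
    gates = softmax(x @ gate_weights)   # (n, m), rows sum to 1
    expert_means = x @ expert_weights   # (n, m)
    mean = (gates * expert_means).sum(axis=1)
    # Variance of a mixture: E[y^2] - (E[y])^2.
    second_moment = (gates * (noise_var + expert_means ** 2)).sum(axis=1)
    return mean, second_moment - mean ** 2
```

The `noise_var` entries are the noise levels the abstract refers to: maximum likelihood can drive them toward zero on the training data, which is the underestimation problem the Bayesian treatment is meant to avoid.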

artificial intelligence, bayesian inference, machine learning, (18 more...)

Country: Europe > United Kingdom > England (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mahadevan, Sridhar, Kaelbling, Leslie Pack

The National Science Foundation Workshop on Reinforcement Learning

AI Magazine, Dec-15-1996

Reinforcement learning has become one of the most actively studied learning frameworks in the area of intelligent autonomous agents. This article describes the results of a three-day meeting of leading researchers in this area that was sponsored by the National Science Foundation. Because reinforcement learning is an interdisciplinary topic, the workshop brought together researchers from a variety of fields, including machine learning, neural networks, AI, robotics, and operations research. Thirty leading researchers from the United States, Canada, Europe, and Japan, representing many different universities, government, and industrial research laboratories, participated in the workshop. The goals of the meeting were to (1) understand limitations of current reinforcement-learning systems and define promising directions for further research; (2) clarify the relationships between reinforcement learning and existing work in engineering fields, such as operations research; and (3) identify potential industrial applications of reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

AI Magazine

Country: North America > United States > California (0.28)

Industry: Government > Regional Government > North America Government > United States Government (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)