AITopics

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.15)

Industry: Government (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Neural Information Processing Systems, Dec-31-1997

Ensemble Methods for Phoneme Classification

Waterhouse, Steve R., Cook, Gary

There is now considerable interest in using ensembles or committees of learning machines to improve the performance of the system over that of a single learning machine. In most neural network ensembles, the ensemble members are trained on either the same data (Hansen & Salamon 1990) or different subsets of the data (Perrone & Cooper 1993). The ensemble members typically have different initial conditions and/or different architectures. The subsets of the data may be chosen at random, with prior knowledge or by some principled approach e.g.

artificial intelligence, ensemble, neural network, (17 more...)

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.14)

Industry: Government (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Bayesian Methods for Mixtures of Experts

Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.

ABSTRACT We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g. Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".

algorithm, artificial intelligence, bayesian inference, (15 more...)

Country: Europe > United Kingdom > England (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Methods for Mixtures of Experts

Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.

ABSTRACT We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented, e.g. Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".

algorithm, artificial intelligence, bayesian inference, (18 more...)

Country: Europe > United Kingdom > England (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Constructive Algorithms for Hierarchical Mixtures of Experts

Waterhouse, Steve R., Robinson, Anthony J.

By applying a likelihood splitting criteria to each expert in the HME we "grow" the tree adaptively during training. Secondly, by considering only the most probable path through the tree we may "prune" branches away, either temporarily, or permanently if they become redundant. We demonstrate results for the growing and path pruning algorithms which show significant speed ups and more efficient use of parameters over the standard fixed structure in discriminating between two interlocking spirals and classifying 8-bit parity patterns. INTRODUCTION The HME (Jordan & Jacobs 1994) is a tree structured network whose terminal nodes are simple function approximators in the case of regression or classifiers in the case of classification. The outputs of the terminal nodes or experts are recursively combined upwards towards the root node, to form the overall output of the network, by "gates" which are situated at the non-terminal nodes.
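The recursive combination described in this abstract can be illustrated with a minimal sketch. It assumes linear experts, softmax gates, and a nested-dictionary tree layout; the names `hme_output`, `softmax`, `dot`, and `tree` are hypothetical and are not taken from the paper, and neither the growing nor the pruning algorithm is shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    """Inner product of a weight vector and an input vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def hme_output(node, x):
    """Combine expert outputs upwards through the gates (regression case)."""
    if "expert" in node:
        # Terminal node: an assumed linear expert.
        return dot(node["expert"], x)
    # Non-terminal node: the gate weights each child's output.
    gate = softmax([dot(v, x) for v in node["gate"]])
    return sum(g * hme_output(child, x)
               for g, child in zip(gate, node["children"]))

# Hypothetical two-level tree; all-zero gate weights give uniform mixing.
tree = {
    "gate": [[0.0, 0.0], [0.0, 0.0]],
    "children": [
        {
            "gate": [[0.0, 0.0], [0.0, 0.0]],
            "children": [{"expert": [1.0, 0.0]}, {"expert": [0.0, 1.0]}],
        },
        {"expert": [1.0, 1.0]},
    ],
}

prediction = hme_output(tree, [1.0, 2.0])
```

With uniform gates the root output is simply the average of its children's outputs; trained gate weights would instead shift the mixing proportions with the input.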

artificial intelligence, neural network, node, (18 more...)

Country:

North America > United States (0.47)
Europe > United Kingdom (0.28)
Asia > Middle East > Jordan (0.25)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Constructive Algorithms for Hierarchical Mixtures of Experts

Waterhouse, Steve R., Robinson, Anthony J.

By applying a likelihood splitting criteria to each expert in the HME we "grow" the tree adaptively during training. Secondly, by considering only the most probable path through the tree we may "prune" branches away, either temporarily, or permanently if they become redundant. We demonstrate results for the growing and path pruning algorithms which show significant speed ups and more efficient use of parameters over the standard fixed structure in discriminating between two interlocking spirals and classifying 8-bit parity patterns. INTRODUCTION The HME (Jordan & Jacobs 1994) is a tree structured network whose terminal nodes are simple function approximators in the case of regression or classifiers in the case of classification. The outputs of the terminal nodes or experts are recursively combined upwards towards the root node, to form the overall output of the network, by "gates" which are situated at the non-terminal nodes.

artificial intelligence, neural network, node, (18 more...)

Country:

North America > United States (0.47)
Europe > United Kingdom (0.28)
Asia > Middle East > Jordan (0.25)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Neural Information Processing Systems, Dec-31-1995

Non-linear Prediction of Acoustic Vectors Using Hierarchical Mixtures of Experts

Waterhouse, Steve R., Robinson, Anthony J.

We are concerned in this paper with the application of multiple models, specifically the Hierarchical Mixtures of Experts, to time series prediction, specifically the problem of predicting acoustic vectors for use in speech coding. There have been a number of applications of multiple models in time series prediction. A classic example is the Threshold Autoregressive model (TAR) which was used by Tong & Lim (1980) to predict sunspot activity. More recently, Lewis, Kay and Stevens (in Weigend & Gershenfeld (1994)) describe the use of Multivariate Adaptive Regression Splines (MARS) for the prediction of future values of currency exchange rates. Finally, in speech prediction, Cuperman & Gersho (1985) describe the Switched Inter-frame Vector Prediction (SIVP) method which switches between separate linear predictors trained on different statistical classes of speech.
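As a rough illustration of the multiple-model idea behind a threshold autoregressive predictor, the sketch below switches between two linear autoregressive models depending on whether a lagged value exceeds a threshold. The function name `tar_predict`, its parameters, and the coefficient values are hypothetical choices for illustration and are not taken from Tong & Lim (1980) or from this paper.

```python
def tar_predict(history, threshold, low_coeffs, high_coeffs, delay=1):
    """One-step prediction from a two-regime threshold autoregressive model.

    Each coefficient tuple is (intercept, a1, a2, ...), where a_k multiplies
    the value k steps back. The regime is selected by comparing the value
    `delay` steps back with `threshold`.
    """
    regime = low_coeffs if history[-delay] <= threshold else high_coeffs
    intercept, *ar = regime
    return intercept + sum(a * history[-(k + 1)] for k, a in enumerate(ar))

# Assumed example values: an AR(1) model in each regime.
low = (0.0, 0.5)
high = (1.0, 0.9)

next_low = tar_predict([1.0, 2.0], threshold=2.5,
                       low_coeffs=low, high_coeffs=high)
next_high = tar_predict([1.0, 2.0, 3.0], threshold=2.5,
                        low_coeffs=low, high_coeffs=high)
```

The hard switch on a threshold is what distinguishes this kind of model from a mixture of experts, where a gate blends the predictors with input-dependent weights.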

artificial intelligence, machine learning, prediction, (15 more...)

Country:

Europe > United Kingdom (0.28)
North America > United States > Colorado > Boulder County > Boulder (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Neural Information Processing Systems, Dec-31-1995

Non-linear Prediction of Acoustic Vectors Using Hierarchical Mixtures of Experts

Waterhouse, Steve R., Robinson, Anthony J.

We are concerned in this paper with the application of multiple models, specifically the Hierarchical Mixtures of Experts, to time series prediction, specifically the problem of predicting acoustic vectors for use in speech coding. There have been a number of applications of multiple models in time series prediction. A classic example is the Threshold Autoregressive model (TAR) which was used by Tong & Lim (1980) to predict sunspot activity. More recently, Lewis, Kay and Stevens (in Weigend & Gershenfeld (1994)) describe the use of Multivariate Adaptive Regression Splines (MARS) for the prediction of future values of currency exchange rates. Finally, in speech prediction, Cuperman & Gersho (1985) describe the Switched Inter-frame Vector Prediction (SIVP) method which switches between separate linear predictors trained on different statistical classes of speech.

artificial intelligence, machine learning, prediction, (16 more...)