Statistical Learning
Reinforcement Learning Based on On-Line EM Algorithm
The actor and the critic are approximated by Normalized Gaussian Networks (NGnet), which are networks of local linear regression units. The NGnet is trained by the online EM algorithm proposed in our previous paper.We apply our RL method to the task of swinging-up and stabilizing a single pendulum and the task of balancing a double pendulumnear the upright position. The experimental results show that our RL method can be applied to optimal control problems havingcontinuous state/action spaces and that the method achieves good control with a small number of trial-and-errors. 1 INTRODUCTION Reinforcement learning (RL) methods (Barto et al., 1990) have been successfully applied to various Markov decision problems having finite state/action spaces, such as the backgammon game (Tesauro, 1992) and a complex task in a dynamic environment (Lin,1992). On the other hand, applications to continuous state/action problems (Werbos, 1990; Doya, 1996; Sofge & White, 1992) are much more difficult than the finite state/action cases. Good function approximation methods and fast learning algorithms are crucial for successful applications.
Robot Docking Using Mixtures of Gaussians
Williamson, Matthew M., Murray-Smith, Roderick, Hansen, Volker
This paper applies the Mixture of Gaussians probabilistic model, combined withExpectation Maximization optimization to the task of summarizing threedimensional range data for a mobile robot. This provides a flexible way of dealing with uncertainties in sensor information, and allows theintroduction of prior knowledge into low-level perception modules. Problemswith the basic approach were solved in several ways: the mixture of Gaussians was reparameterized to reflect the types of objects expected in the scene, and priors on model parameters were included in the optimization process. Both approaches force the optimization to find'interesting' objects, given the sensor and object characteristics. A higher level classifier was used to interpret the results provided by the model, and to reject spurious solutions.
Graph Matching for Shape Retrieval
Huet, Benoit, Cross, Andrew D. J., Hancock, Edwin R.
We propose a new in-sample cross validation based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in'soft' classification. Soft classification refersto a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class O. The target for optimizing the the tradeoff is the Kullback-Liebler distance between the estimated probability distribution and the'true' probability distribution,representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
Adding Constrained Discontinuities to Gaussian Process Models of Wind Fields
Cornford, Dan, Nabney, Ian T., Williams, Christopher K. I.
Gaussian Processes provide good prior models for spatial data, but can be too smooth. In many physical situations there are discontinuities along bounding surfaces, for example fronts in near-surface wind fields. We describe a modelling method for such a constrained discontinuity and demonstrate how to infer the model parameters in wind fields with MCMC sampling.
Classification in Non-Metric Spaces
Weinshall, Daphna, Jacobs, David W., Gdalyahu, Yoram
A key question in vision is how to represent our knowledge of previously encountered objects to classify new ones. The answer depends on how we determine the similarity of two objects. Similarity tells us how relevant each previously seen object is in determining the category to which a new object belongs.
Support Vector Machines Applied to Face Recognition
On the other hand, in 804 P.J Phillips face recognition, there are many individuals (classes), and only a few images (samples) per person, and algorithms must recognize faces by extrapolating from the training samples. In numerous applications there can be only one training sample (image) of each person. Support vector machines (SVMs) are formulated to solve a classical two class pattern recognition problem. We adapt SVM to face recognition by modifying the interpretation of the output of a SVM classifier and devising a representation of facial images that is concordant with a two class problem. Traditional SVM returns a binary value, the class of the object.
The Bias-Variance Tradeoff and the Randomized GACV
Wahba, Grace, Lin, Xiwu, Gao, Fangyu, Xiang, Dong, Klein, Ronald, Klein, Barbara
We propose a new in-sample cross validation based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in'soft' classification. Soft classification refersto a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class O. The target for optimizing the the tradeoff is the Kullback-Liebler distance between the estimated probability distribution and the'true' probability distribution,representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
Probabilistic Visualisation of High-Dimensional Binary Data
We present a probabilistic latent-variable framework for data visualisation, akey feature of which is its applicability to binary and categorical data types for which few established methods exist. A variational approximation to the likelihood is exploited to derive a fast algorithm for determining the model parameters. Illustrations of application to real and synthetic binary data sets are given.
Batch and On-Line Parameter Estimation of Gaussian Mixtures Based on the Joint Entropy
Singer, Yoram, Warmuth, Manfred K.
We describe a new iterative method for parameter estimation of Gaussian mixtures.The new method is based on a framework developed by Kivinen and Warmuth for supervised online learning. In contrast to gradient descentand EM, which estimate the mixture's covariance matrices, the proposed method estimates the inverses of the covariance matrices. Furthennore, the new parameter estimation procedure can be applied in both online and batch settings. We show experimentally that it is typically fasterthan EM, and usually requires about half as many iterations as EM. 1 Introduction Mixture models, in particular mixtures of Gaussians, have been a popular tool for density estimation, clustering, and unsupervised learning with a wide range of applications (see for instance [5, 2] and the references therein). Mixture models are one of the most useful tools for handling incomplete data, in particular hidden variables.
Kernel PCA and De-Noising in Feature Spaces
Mika, Sebastian, Schölkopf, Bernhard, Smola, Alex J., Müller, Klaus-Robert, Scholz, Matthias, Rätsch, Gunnar
Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered asa natural generalization of linear principal component analysis. This gives rise to the question how to use nonlinear features for data compression, reconstruction, and de-noising, applications common in linear PCA. This is a nontrivial task, as the results provided by kernel PCAlive in some high dimensional feature space and need not have pre-images in input space. This work presents ideas for finding approximate pre-images,focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real world data.