Goto

Collaborating Authors

 Asia



RoboCup-2001: The Fifth Robotic Soccer World Championships

AI Magazine

RoboCup-2001 was the Fifth International RoboCup Competition and Conference. It was held for the first time in the United States, following RoboCup-2000 in Melbourne, Australia; RoboCup-99 in Stockholm; RoboCup-98 in Paris; and RoboCup-97 in Osaka. This article discusses in detail each one of the events at RoboCup-2001, focusing on the competition leagues.


Ensemble Learning and Linear Response Theory for ICA

Neural Information Processing Systems

We propose a general Bayesian framework for performing independent (leA) which relies on ensemble learning and linearcomponent analysis response theory known from statistical physics. We apply it to both discrete and continuous sources. For the continuous source the underdetermined (overcomplete) case is studied. The naive mean-field approach fails in this case whereas linear response theory-which gives an improved estimate of covariances-is very efficient. The examples given are for sources without temporal correlations. However, this derivation can easily to treat temporal correlations. Finally, the frameworkbe extended of generating new leA algorithms without needingoffers a simple way to define the prior distribution of the sources explicitly.


Learning Curves for Gaussian Processes Regression: A Framework for Good Approximations

Neural Information Processing Systems

Based on a statistical mechanics approach, we develop a method for approximately computing average case learning curves for Gaussian process regression models. The approximation works well in the large sample size limit and for arbitrary dimensionality of the input space. We explain how the approximation can be systematically improved and argue that similar techniques can be applied to general likelihood models. 1 Introduction Gaussian process (GP) models have gained considerable interest in the Neural Computation Community (see e.g.[I, 2, 3, 4]) in recent years. Being nonparametric models by construction their theoretical understanding seems to be less well developed compared to simpler parametric models like neural networks. We are especially interested in developing theoretical approaches which will at least give good approximations to generalization errors when the number of training data is sufficiently large. In this paper we present a step in this direction which is based on a statistical mechanics approach.


Discovering Hidden Variables: A Structure-Based Approach

Neural Information Processing Systems

A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. As such, they induce seemingly complex dependencies among the latter. In recent years, much attention has been devoted to the development of algorithms for learning parameters, and in some cases structure, in the presence of hidden variables. In this paper, we address the related problem of detecting hidden variables that interact with the observed variables.


Propagation Algorithms for Variational Bayesian Learning

Neural Information Processing Systems

Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugate-exponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linear-Gaussian state-space models we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the state-space model in a variety of synthetic problems and one real high-dimensional data set. 1 Introduction Bayesian approaches to machine learning have several desirable properties.


Natural Sound Statistics and Divisive Normalization in the Auditory System

Neural Information Processing Systems

We explore the statistical properties of natural sound stimuli preprocessed with a bank of linear filters. The responses of such filters exhibit a striking form of statistical dependency, in which the response variance of each filter grows with the response amplitude of filters tuned for nearby frequencies. These dependencies may be substantially reduced using an operation known as divisive normalization, in which the response of each filter is divided by a weighted sum of the rectified responses of other filters. The weights may be chosen to maximize the independence of the normalized responses for an ensemble of natural sounds. We demonstrate that the resulting model accounts for nonlinearities in the response characteristics of the auditory nerve, by comparing model simulations to electrophysiological recordings.


Stagewise Processing in Error-correcting Codes and Image Restoration

Neural Information Processing Systems

We introduce stagewise processing in error-correcting codes and image restoration, by extracting information from the former stage and using it selectively to improve the performance of the latter one. Both mean-field analysis using the cavity method and simulations show that it has the advantage of being robust against uncertainties in hyperparameter estimation. 1 Introduction In error-correcting codes [1] and image restoration [2], the choice of the so-called hyperparameters is an important factor in determining their performances. Hyperparameters refer to the coefficients weighing the biases and variances of the tasks. In error correction, they determine the statistical significance given to the paritychecking terms and the received bits. Similarly in image restoration, they determine the statistical weights given to the prior knowledge and the received data.


Recognizing Hand-written Digits Using Hierarchical Products of Experts

Neural Information Processing Systems

The product of experts learning procedure [1] can discover a set of stochastic binary features that constitute a nonlinear generative model of handwritten images of digits. The quality of generative models learned in this way can be assessed by learning a separate model for each class of digit and then comparing the unnormalized probabilities of test images under the 10 different class-specific models. To improve discriminative performance, it is helpful to learn a hierarchy of separate models for each digit class. Each model in the hierarchy has one layer of hidden units and the nth level model is trained on data that consists of the activities of the hidden units in the already trained (n - l)th level model. After training, each level produces a separate, unnormalized log probabilty score. With a three-level hierarchy for each of the 10 digit classes, a test image produces 30 scores which can be used as inputs to a supervised, logistic classification network that is trained on separate data. On the MNIST database, our system is comparable with current state-of-the-art discriminative methods, demonstrating that the product of experts learning procedure can produce effective generative models of high-dimensional data. 1 Learning products of stochastic binary experts Hinton [1] describes a learning algorithm for probabilistic generative models that are composed of a number of experts. Each expert specifies a probability distribution over the visible variables and the experts are combined by multiplying these distributions together and renormalizing.


Analysis of Bit Error Probability of Direct-Sequence CDMA Multiuser Demodulators

Neural Information Processing Systems

We analyze the bit error probability of multiuser demodulators for directsequence binary phase-shift-keying (DSIBPSK) CDMA channel with additive gaussian noise. The problem of multiuser demodulation is cast into the finite-temperature decoding problem, and replica analysis is applied to evaluate the performance of the resulting MPM (Marginal Posterior Mode) demodulators, which include the optimal demodulator and the MAP demodulator as special cases. An approximate implementation of demodulators is proposed using analog-valued Hopfield model as a naive mean-field approximation to the MPM demodulators, and its performance is also evaluated by the replica analysis. Results of the performance evaluation shows effectiveness of the optimal demodulator and the mean-field demodulator compared with the conventional one, especially in the cases of small information bit rate and low noise level.