Weak Learners and Improved Rates of Convergence in Boosting

Neural Information Processing Systems

We present an algorithm that produces a linear classifier that is guaranteed to achieve an error better than random guessing for any distribution on the data. While this weak learner is not useful for learning in general, we show that under reasonable conditions on the distribution it yields an effective weak learner for one-dimensional problems. Preliminary simulations suggest that similar behavior can be expected in higher dimensions, a result which is corroborated by some recent theoretical bounds. Additionally, we provide improved convergence rate bounds for the generalization error in situations where the empirical error can be made small, which is exactly the situation that occurs if weak learners with guaranteed performance that is better than random guessing can be established.
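A one-dimensional weak learner of this kind can be sketched as a weighted decision stump. This is an illustrative reconstruction, not the paper's exact algorithm; the data and weights below are made up for demonstration:

```python
import numpy as np

def fit_stump(x, y, w):
    """Best weighted decision stump h(x) = s * sign(x - t).
    Its weighted error is always <= 1/2, since flipping the sign s
    turns an error e into 1 - e."""
    xs = np.sort(x)
    # candidate thresholds: below all points, then midpoints between neighbors
    thresholds = np.concatenate(([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2))
    best_err, best_t, best_s = np.inf, None, None
    for t in thresholds:
        pred = np.where(x > t, 1, -1)
        err = np.sum(w[pred != y])                  # error of sign s = +1
        for s, e in ((1, err), (-1, np.sum(w) - err)):
            if e < best_err:
                best_err, best_t, best_s = e, t, s
    return best_err, best_t, best_s

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.where(x + 0.3 * rng.normal(size=200) > 0, 1, -1)
w = np.full(200, 1.0 / 200)
err, t, s = fit_stump(x, y, w)
print(err <= 0.5)  # True: never worse than random guessing
```

The sign-flip argument in the docstring is what yields the guarantee for any weighting of the data.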


`N-Body' Problems in Statistical Learning

Neural Information Processing Systems

We present efficient algorithms for all-point-pairs problems, or 'N-body'-like problems, which are ubiquitous in statistical learning. We focus on six examples, including nearest-neighbor classification, kernel density estimation, outlier detection, and the two-point correlation.
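The naive all-point-pairs computation that such algorithms accelerate can be sketched for kernel density estimation; this is the baseline O(N·M) sum, not the paper's tree-based method, and the bandwidth and data are illustrative assumptions:

```python
import numpy as np

def kde_all_pairs(queries, refs, h):
    """Naive Gaussian kernel density estimate: for each query point,
    sum a kernel contribution from every reference point. Tree-based
    'N-body' algorithms compute the same sums faster by pruning
    far-apart pairs of points."""
    d2 = ((queries[:, None, :] - refs[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2 * h * h))
    norm = refs.shape[0] * (h * np.sqrt(2 * np.pi)) ** queries.shape[1]
    return k.sum(axis=1) / norm

rng = np.random.default_rng(1)
refs = rng.normal(size=(500, 2))
dens = kde_all_pairs(refs[:10], refs, h=0.5)
print(dens.shape)  # (10,)
```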


Sparse Kernel Principal Component Analysis

Neural Information Processing Systems

'Kernel' principal component analysis (PCA) is an elegant nonlinear generalisation of the popular linear data analysis method, where a kernel function implicitly defines a nonlinear transformation into a feature space wherein standard PCA is performed. Unfortunately, the technique is not 'sparse', since the components thus obtained are expressed in terms of kernels associated with every training vector. This paper shows that by approximating the covariance matrix in feature space by a reduced number of example vectors, using a maximum-likelihood approach, we may obtain a highly sparse form of kernel PCA without loss of effectiveness. Principal component analysis (PCA) is a well-established technique for dimensionality reduction, and examples of its many applications include data compression, image processing, visualisation, exploratory data analysis, pattern recognition and time series prediction.
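The gain from sparsity can be illustrated with a subset-based kernel PCA sketch: components are expanded over a few chosen vectors rather than all of them. This is a simple subset approximation for illustration, not the paper's maximum-likelihood fitting procedure, and the kernel, subset size, and data are assumptions:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
idx = rng.choice(200, size=20, replace=False)
Z = X[idx]                                # 20 expansion vectors instead of 200

# Eigendecompose the small kernel matrix over the subset; each resulting
# component is a sum over only 20 kernels (centering omitted for brevity).
Kzz = rbf_kernel(Z, Z)
eigvals, eigvecs = np.linalg.eigh(Kzz)
top = eigvecs[:, -2:]                     # two leading components
proj = rbf_kernel(X, Z) @ top             # sparse-expansion features
print(proj.shape)  # (200, 2)
```

Evaluating a component on a new point now costs 20 kernel evaluations rather than one per training vector.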


Stagewise Processing in Error-correcting Codes and Image Restoration

Neural Information Processing Systems

Both mean-field analysis using the cavity method and simulations show that the proposed stagewise approach has the advantage of being robust against uncertainties in hyperparameter estimation. In error-correcting codes [1] and image restoration [2], the choice of the so-called hyperparameters is an important factor in determining their performance.


High-temperature Expansions for Learning Models of Nonnegative Data

Neural Information Processing Systems

Recent work has exploited boundedness of data in the unsupervised learning of new types of generative model. For nonnegative data it was recently shown that the maximum-entropy generative model is a Nonnegative Boltzmann Distribution, not a Gaussian distribution, when the model is constrained to match the first and second order statistics of the data. Learning for practical-sized problems is made difficult by the need to compute expectations under the model distribution. The computational cost of Markov chain Monte Carlo methods and the low fidelity of naive mean field techniques have led to increasing interest in advanced mean field theories and variational methods. Here I present a second-order mean-field approximation for the Nonnegative Boltzmann Machine model, obtained using a "high-temperature" expansion. The theory is tested on learning a bimodal 2-dimensional model, a high-dimensional translationally invariant distribution, and a generative model for handwritten digits.


Robust Reinforcement Learning

Neural Information Processing Systems

Kenji Doya, ATR International; CREST, JST, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan (doya@isd.atr.co.jp). This paper proposes a new reinforcement learning (RL) paradigm that explicitly takes into account input disturbance as well as modeling errors. The use of environmental models in RL is quite popular both for off-line learning by simulation and for online action planning. However, the difference between the model and the real environment can lead to unpredictable, often unwanted, results. Based on the theory of H∞ control, we consider a differential game in which a 'disturbing' agent (disturber) tries to make the worst possible disturbance while a 'control' agent (actor) tries to make the best control input. The problem is formulated as finding a min-max solution of a value function that takes into account the norm of the output deviation and the norm of the disturbance. We derive online learning algorithms for estimating the value function and for calculating the worst disturbance and the best control in reference to the value function.
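The min-max value function idea can be illustrated on a toy discrete chain, where the actor picks a move and the disturber then picks the worst bounded push. This is a discrete sketch under assumed dynamics and rewards, not the paper's continuous H∞ formulation or its online algorithms:

```python
import numpy as np

# Minmax value iteration on an 11-state chain: V(s) = r(s) + gamma *
# max_u min_v V(clip(s + u + v)), with control u and disturbance v
# each in {-1, 0, 1}. Reward penalizes distance from a goal state.
n, gamma = 11, 0.95
goal = n // 2
reward = -np.abs(np.arange(n) - goal).astype(float)
V = np.zeros(n)
for _ in range(500):
    V_new = np.empty(n)
    for s in range(n):
        best = -np.inf
        for u in (-1, 0, 1):                          # actor's control
            worst = min(V[int(np.clip(s + u + v, 0, n - 1))]
                        for v in (-1, 0, 1))          # worst-case disturbance
            best = max(best, worst)
        V_new[s] = reward[s] + gamma * best
    V = V_new
print(int(np.argmax(V)) == goal)  # True: value peaks at the goal state
```

Because the inner minimum assumes an adversarial disturbance, the resulting policy hedges against the worst push rather than the average one.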


One Microphone Source Separation

Neural Information Processing Systems

Source separation, or computational auditory scene analysis, attempts to extract individual acoustic objects from input which contains a mixture of sounds from different sources, altered by the acoustic environment. Unmixing algorithms such as ICA and its extensions recover sources by reweighting multiple observation sequences, and thus cannot operate when only a single observation signal is available. I present a technique called refiltering, which recovers sources by a nonstationary reweighting ("masking") of frequency sub-bands from a single recording, and argue for the application of statistical algorithms to learning this masking function. I present results of a simple factorial HMM system which learns on recordings of single speakers and can then separate mixtures using only one observation signal, by computing the masking function and then refiltering.
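Refiltering with a frequency sub-band mask can be sketched as follows. The oracle binary mask below stands in for the masking function that the factorial HMM system learns; the sinusoidal "sources", sample rate, and window size are illustrative assumptions:

```python
import numpy as np

def stft(x, win=256):
    """Non-overlapping frames + per-frame real FFT (overlap omitted for brevity)."""
    frames = x[: len(x) // win * win].reshape(-1, win)
    return np.fft.rfft(frames, axis=1)

def istft(X, win=256):
    return np.fft.irfft(X, n=win, axis=1).ravel()

fs, win = 8000, 256
t = np.arange(fs) / fs
low = np.sin(2 * np.pi * 312.5 * t)    # stands in for source 1 (bin 10)
high = np.sin(2 * np.pi * 2500.0 * t)  # stands in for source 2 (bin 80)
mix = low + high                       # the single observed recording

X = stft(mix, win)
freqs = np.fft.rfftfreq(win, 1.0 / fs)
mask = freqs < 1000.0                  # oracle mask; the HMM *learns* this
rec = istft(X * mask, win)             # refiltered estimate of source 1
err = np.mean((rec - low[: len(rec)]) ** 2)
print(err < 1e-6)  # True: masking recovers the low-frequency source
```

Real speech mixtures overlap in frequency and change over time, which is why the mask must be nonstationary and inferred statistically rather than fixed as here.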