Learning Sparse Multiscale Image Representations

Neural Information Processing Systems

We describe a method for learning sparse multiscale image representations using a sparse prior distribution over the basis function coefficients. The prior consists of a mixture of a Gaussian and a Dirac delta function, and thus encourages coefficients to take exactly zero values. Coefficients for an image are computed by sampling from the resulting posterior distribution with a Gibbs sampler. The learned basis is similar to the Steerable Pyramid basis, and yields slightly higher SNR for the same number of active coefficients. Denoising using the learned image model is demonstrated on some standard test images, with results that compare favorably with other denoising methods.
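As a rough illustration of the coefficient update, here is a minimal single-coefficient Gibbs step for a spike-and-slab (Dirac-plus-Gaussian) prior, assuming a scalar observation x = phi*a + Gaussian noise. All function names and parameters are illustrative, not taken from the paper:

```python
import numpy as np

def gaussian_pdf(x, var):
    return np.exp(-0.5 * x * x / var) / np.sqrt(2 * np.pi * var)

def posterior_active_prob(x, phi, p_active, sigma_a, sigma_n):
    """Posterior probability that the coefficient is nonzero, under
    x = phi*a + noise, with a ~ (1-p)*delta(a) + p*N(0, sigma_a^2)."""
    lik_zero = gaussian_pdf(x, sigma_n**2)                       # a = 0
    lik_active = gaussian_pdf(x, (phi * sigma_a)**2 + sigma_n**2)  # a marginalized
    num = p_active * lik_active
    return num / (num + (1 - p_active) * lik_zero)

def gibbs_sample_coefficient(x, phi, p_active, sigma_a, sigma_n, rng):
    """One Gibbs draw of a spike-and-slab coefficient."""
    q = posterior_active_prob(x, phi, p_active, sigma_a, sigma_n)
    if rng.random() >= q:
        return 0.0                       # spike: exact zero, hence sparsity
    # slab: standard linear-Gaussian posterior for a single coefficient
    post_var = 1.0 / (phi**2 / sigma_n**2 + 1.0 / sigma_a**2)
    post_mean = post_var * phi * x / sigma_n**2
    return rng.normal(post_mean, np.sqrt(post_var))
```

The exact-zero branch is what distinguishes this prior from a plain Gaussian: coefficients are switched off entirely rather than merely shrunk.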


Boosted Dyadic Kernel Discriminants

Neural Information Processing Systems

We introduce a novel learning algorithm for binary classification with hyperplane discriminants based on pairs of training points from opposite classes (dyadic hypercuts). This algorithm is further extended to nonlinear discriminants using kernel functions satisfying Mercer's conditions. An ensemble of simple dyadic hypercuts is learned incrementally by means of a confidence-rated version of AdaBoost, which provides a sound strategy for searching through the finite set of hypercut hypotheses. In experiments with real-world datasets from the UCI repository, the generalization performance of the hypercut classifiers was found to be comparable to that of SVMs and k-NN classifiers. Furthermore, the computational cost of classification (at run time) was found to be similar to, or better than, that of SVMs. Similarly to SVMs, boosted dyadic kernel discriminants tend to maximize the margin (via AdaBoost). In contrast to SVMs, however, we offer an online and incremental learning machine for building kernel discriminants whose complexity (number of kernel evaluations) can be directly controlled (traded off for accuracy).
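The weak learner can be sketched as an exhaustive search over opposite-class pairs. Below is a minimal version with an RBF kernel and zero bias; the function names, the kernel choice, and the lack of a bias term are simplifying assumptions, not the paper's exact formulation:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    return np.exp(-gamma * np.sum((a - b) ** 2))

def best_dyadic_hypercut(X, y, w, gamma=1.0):
    """Search all opposite-class pairs (x+, x-) for the kernel discriminant
    f(x) = k(x, x+) - k(x, x-) with the lowest weighted error under the
    current AdaBoost weights w."""
    pos = np.where(y == 1)[0]
    neg = np.where(y == -1)[0]
    best_pair, best_err = None, np.inf
    for i in pos:
        for j in neg:
            scores = np.array([rbf(x, X[i], gamma) - rbf(x, X[j], gamma)
                               for x in X])
            preds = np.where(scores >= 0, 1, -1)
            err = np.sum(w[preds != y])      # weighted training error
            if err < best_err:
                best_err, best_pair = err, (i, j)
    return best_pair, best_err
```

The hypothesis space is finite (one hypercut per opposite-class pair), which is what makes the boosting search tractable.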



Manifold Parzen Windows

Neural Information Processing Systems

The similarity between objects is a fundamental element of many learning algorithms. Most nonparametric methods take this similarity to be fixed, but much recent work has shown the advantages of learning it, in particular to exploit the local invariances in the data or to capture the possibly nonlinear manifold on which most of the data lies. We propose a new nonparametric kernel density estimation method which captures the local structure of an underlying manifold through the leading eigenvectors of regularized local covariance matrices.
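A minimal sketch of such an estimator, assuming each training point contributes a Gaussian kernel whose covariance is the regularized covariance of its k nearest neighbours (the parameter names and neighbourhood rule are illustrative):

```python
import numpy as np

def manifold_parzen_density(x, X, k=3, sigma2=1e-2):
    """Parzen density at x where each training point contributes a Gaussian
    whose covariance is the regularized covariance of its k nearest
    neighbours, so kernels stretch along the local manifold directions."""
    n, d = X.shape
    total = 0.0
    for i in range(n):
        dists = np.sum((X - X[i]) ** 2, axis=1)
        nn = np.argsort(dists)[1:k + 1]               # k nearest neighbours
        diffs = X[nn] - X[i]
        C = diffs.T @ diffs / k + sigma2 * np.eye(d)  # local covariance + ridge
        delta = x - X[i]
        sol = np.linalg.solve(C, delta)
        logdet = np.linalg.slogdet(C)[1]
        total += np.exp(-0.5 * delta @ sol
                        - 0.5 * (logdet + d * np.log(2 * np.pi)))
    return total / n
```

Because each kernel is wide along the manifold and narrow across it, the estimate decays much faster off-manifold than an isotropic Parzen window would.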


Discriminative Binaural Sound Localization

Neural Information Processing Systems

Time difference of arrival (TDOA) is commonly used to estimate the azimuth of a source in a microphone array. The most common methods for estimating TDOA are based on finding extrema in generalized cross-correlation waveforms. In this paper we apply microphone array techniques to a manikin head. By considering the entire cross-correlation waveform we achieve azimuth prediction accuracy that exceeds that of extremum-locating methods. We do so by quantizing the azimuthal angle and treating the prediction problem as a multiclass categorization task. We demonstrate the merits of our approach by evaluating the various approaches on Sony's AIBO robot.
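The classical baseline, estimating TDOA from the peak of the plain cross-correlation, can be sketched as follows (the GCC weighting and manikin-head specifics are omitted; the names are illustrative):

```python
import numpy as np

def estimate_delay(sig_ref, sig_delayed, fs):
    """Estimate how many seconds sig_delayed lags behind sig_ref from the
    peak location of their cross-correlation waveform."""
    corr = np.correlate(sig_delayed, sig_ref, mode="full")
    lag = np.argmax(corr) - (len(sig_ref) - 1)   # lag in samples
    return lag / fs

fs = 16000
rng = np.random.default_rng(1)
src = rng.standard_normal(1024)
delay = 20                                       # right mic hears the source 20 samples late
left = np.concatenate([src, np.zeros(delay)])
right = np.concatenate([np.zeros(delay), src])
tdoa = estimate_delay(left, right, fs)
```

The multiclass approach in the paper instead feeds the whole `corr` waveform to a classifier over quantized azimuth bins, rather than reducing it to the single `argmax`.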


Learning to Classify Galaxy Shapes Using the EM Algorithm

Neural Information Processing Systems

We describe the application of probabilistic model-based learning to the problem of automatically identifying classes of galaxies, based on both morphological and pixel intensity characteristics. The EM algorithm can be used to learn how to spatially orient a set of galaxies so that they are geometrically aligned. We augment this "ordering-model" with a mixture model on objects, and demonstrate how classes of galaxies can be learned in an unsupervised manner using a two-level EM algorithm. The resulting models provide highly accurate classification of galaxies in cross-validation experiments.
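The two-level EM of the paper is richer than this, but the orientation-alignment idea can be sketched with a hard-EM toy on 1-D "images" under cyclic shifts (a deliberate simplification, not the paper's algorithm):

```python
import numpy as np

def em_align(X, n_iters=20):
    """Align 1-D 'images' that are cyclic shifts of a common template.
    E-step: score every shift of each image against the current template.
    M-step: average the best-aligned copies (hard EM for clarity)."""
    n, d = X.shape
    template = X[0].copy()
    for _ in range(n_iters):
        aligned = []
        for x in X:
            shifts = np.stack([np.roll(x, s) for s in range(d)])
            best = np.argmin(np.sum((shifts - template) ** 2, axis=1))
            aligned.append(shifts[best])
        template = np.mean(aligned, axis=0)
    return template, np.stack(aligned)
```

In the full model the transformation variable (orientation) is marginalized softly rather than chosen greedily, and a mixture over object classes sits on top of the alignment.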


Expected and Unexpected Uncertainty: ACh and NE in the Neocortex

Neural Information Processing Systems

Experimental and theoretical studies suggest that different forms of uncertainty play different behavioral, neural and computational roles, and may be reported by different (notably neuromodulatory) systems. Here, we refine our previous theory of acetylcholine's role in cortical inference in the (oxymoronic) terms of expected uncertainty, and advocate a theory for norepinephrine in terms of unexpected uncertainty. We suggest that norepinephrine reports the radical divergence of bottom-up inputs from prevailing top-down interpretations, to influence inference and plasticity. We illustrate this proposal using an adaptive factor analysis model.


Fast Transformation-Invariant Factor Analysis

Neural Information Processing Systems

Dimensionality reduction techniques such as principal component analysis and factor analysis are used to discover a linear mapping between high dimensional data samples and points in a lower dimensional subspace. In [6], Jojic and Frey introduced the mixture of transformation-invariant component analyzers (MTCA), which can account for global transformations such as translations and rotations, perform clustering, and learn local appearance deformations by dimensionality reduction.
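For reference, the linear mapping that PCA discovers can be sketched in a few lines (this is plain PCA, not the transformation-invariant mixture model):

```python
import numpy as np

def pca_fit_transform(X, k):
    """Project centred data onto its top-k principal directions, returning
    the low-dimensional coordinates, the d x k loading matrix, and the mean."""
    mu = X.mean(axis=0)
    Xc = X - mu
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T                     # columns = top-k principal directions
    return Xc @ W, W, mu
```

A sample is reconstructed as `Z @ W.T + mu`; factor analysis replaces the orthogonal projection with a learned noise model per observed dimension, and MTCA additionally marginalizes over a set of global transformations.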


An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition

Neural Information Processing Systems

An EM algorithm to train the model is presented, as well as a Viterbi decoder that can be used to obtain the optimal state sequence together with the alignment between the two sequences. One such task, which will be presented in this paper, is multimodal speech recognition using both a microphone and a camera recording a speaker simultaneously while he or she speaks. It is indeed well known that seeing the speaker's face in addition to hearing his or her voice can often improve speech intelligibility, particularly in noisy environments [7], mainly thanks to the complementarity of the visual and acoustic signals. While in the former solution the alignment between the two sequences is decided a priori, in the latter there is no explicit learning of the joint probability of the two sequences. In fact, the model can desynchronize the streams by temporarily stretching one of them in order to obtain a better match between the corresponding frames. The model can thus be directly applied to the problem of audio-visual speech recognition where, for instance, the lips sometimes start to move before any sound is heard.
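The standard single-stream Viterbi recursion that the asynchronous decoder builds on can be sketched as follows (the extra alignment variable between the audio and video streams is omitted here):

```python
import numpy as np

def viterbi(log_A, log_B, log_pi):
    """Most likely state sequence given log transition matrix log_A (S x S),
    per-frame log emission scores log_B (T x S), and log initial
    probabilities log_pi (S,)."""
    T, S = log_B.shape
    delta = log_pi + log_B[0]                 # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A       # scores[prev, next]
        back[t] = np.argmax(scores, axis=0)   # best predecessor per state
        delta = scores[back[t], np.arange(S)] + log_B[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):             # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

The asynchronous HMM augments this dynamic program with one more dimension that tracks how far one stream has been stretched relative to the other, so the decoder recovers the stream alignment jointly with the state sequence.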


Robust Novelty Detection with Single-Class MPM

Neural Information Processing Systems

This algorithm, the "single-class minimax probability machine" (MPM), is built on a distribution-free methodology that minimizes the worst-case probability of a data point falling outside of a convex set, given only the mean and covariance matrix of the distribution and making no further distributional assumptions. We present a robust approach to estimating the mean and covariance matrix within the general two-class MPM setting, and show how this approach specializes to the single-class problem. We provide empirical results comparing the single-class MPM to the single-class SVM and a two-class SVM method. Novelty detection is an important unsupervised learning problem in which test data are to be judged as having been generated from the same or a different process as that which generated the training data.
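A crude sketch of the mean-and-covariance flavour of the approach: flag points whose Mahalanobis distance exceeds a Chebyshev-style threshold. This is a simplified stand-in, not the paper's minimax optimization over convex sets:

```python
import numpy as np

class MahalanobisNovelty:
    """Distribution-free novelty detector in the spirit of the single-class
    MPM: estimate mean and covariance, then flag points whose squared
    Mahalanobis distance exceeds a threshold from a Chebyshev-style bound."""

    def fit(self, X, alpha=0.05):
        self.mu = X.mean(axis=0)
        C = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # ridge for stability
        self.C_inv = np.linalg.inv(C)
        # E[d^2] = dim under the fitted moments, so Chebyshev gives
        # P(d^2 >= t) <= dim/t; pick t = dim/alpha as a crude worst-case
        # threshold for outlier probability alpha.
        self.threshold = X.shape[1] / alpha
        return self

    def predict(self, X):
        """Return True for points judged novel (outside the ellipsoid)."""
        diffs = X - self.mu
        d2 = np.einsum('ij,jk,ik->i', diffs, self.C_inv, diffs)
        return d2 > self.threshold
```

The actual single-class MPM optimizes the set itself to minimize the worst-case outlier probability over all distributions with the given moments, which yields a tighter region than this fixed ellipsoid.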