
Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra

Neural Information Processing Systems

A system has been developed to extract diagnostic information from jet engine carcass vibration data. Support Vector Machines applied to novelty detection provide a measure of how unusual the shape of a vibration signature is, by learning a representation of normality. We describe a novel method for including information from a second class in Support Vector novelty detection and give results from the application to jet engine vibration analysis.
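The novelty-detection setup described here can be sketched with scikit-learn's `OneClassSVM`. The synthetic "spectra" below are illustrative stand-ins for engine vibration signatures, not real data; the peak locations and noise levels are assumptions made for the example.

```python
# Hypothetical stand-in for vibration spectra: a smooth resonance peak
# plus small perturbations (NOT real engine data).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
freqs = np.linspace(0, 1, 64)

# "Normal" signatures: a resonance near 0.3 with small random noise.
normal = np.array([np.exp(-(freqs - 0.3) ** 2 / 0.01)
                   + 0.05 * rng.standard_normal(64) for _ in range(200)])

# Learn a representation of normality from normal examples only.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(normal)

# A novel signature: an extra resonance at an unusual frequency.
novel = (np.exp(-(freqs - 0.3) ** 2 / 0.01)
         + np.exp(-(freqs - 0.7) ** 2 / 0.005))

# The decision function acts as the "how unusual" score: lower = more novel.
print(clf.decision_function(normal).mean())
print(clf.decision_function(novel[None, :])[0])
```

The one-class formulation needs only normal training data; the novel spectrum scores well below the typical normal signature.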


Learning and Tracking Cyclic Human Motion

Neural Information Processing Systems

We estimate a statistical model of typical activities from a large set of 3D periodic human motion data by segmenting these data automatically into "cycles". Then the mean and the principal components of the cycles are computed using a new algorithm that accounts for missing information and enforces smooth transitions between cycles. The learned temporal model provides a prior probability distribution over human motions that can be used in a Bayesian framework for tracking human subjects in complex monocular video sequences and recovering their 3D motion.

1 Introduction

The modeling and tracking of human motion in video is important for problems as varied as animation, video database search, sports medicine, and human-computer interaction. Technically, the human body can be approximated by a collection of articulated limbs, and its motion can be thought of as a collection of time series describing the joint angles as they evolve over time. A key challenge in modeling these joint angles involves decomposing the time series into suitable temporal primitives.
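A toy version of the cycle-extraction step might look like the sketch below. The segmentation by zero crossings and the linear resampling are simplifying assumptions for illustration; the paper's actual algorithm additionally handles missing data and enforces smooth transitions between cycles.

```python
# Illustrative sketch: segment a noisy periodic "joint angle" signal
# into cycles, resample to a common length, and take mean + PCA.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 8 * np.pi, 800)
angle = np.sin(t) + 0.01 * rng.standard_normal(t.size)

# Segment at upward zero crossings, one segment per cycle;
# discard spurious very short segments caused by noise.
up = np.where((angle[:-1] < 0) & (angle[1:] >= 0))[0]
cycles = [angle[a:b] for a, b in zip(up[:-1], up[1:]) if b - a > 50]

# Resample every cycle to a common length so they can be averaged.
L = 100
grid = np.linspace(0, 1, L)
resampled = np.array([np.interp(grid, np.linspace(0, 1, len(c)), c)
                      for c in cycles])

mean_cycle = resampled.mean(axis=0)
# Principal components capture cycle-to-cycle variation.
_, sv, vt = np.linalg.svd(resampled - mean_cycle, full_matrices=False)
print(len(cycles), mean_cycle.shape, vt.shape)
```

The mean cycle and leading components then define the low-dimensional prior over motions used for tracking.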


A Mathematical Programming Approach to the Kernel Fisher Algorithm

Neural Information Processing Systems

We investigate a new kernel-based classifier: the Kernel Fisher Discriminant (KFD). A mathematical programming formulation based on the observation that KFD maximizes the average margin permits an interesting modification of the original KFD algorithm, yielding the sparse KFD. We find that both KFD and the proposed sparse KFD can be understood in a unifying probabilistic context. Furthermore, we show connections to Support Vector Machines and Relevance Vector Machines. From this understanding, we are able to outline an interesting kernel-regression technique based upon the KFD algorithm.
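A minimal numpy sketch of the underlying KFD computation is given below: the discriminant coefficients solve a regularized within-class kernel scatter system. The data, kernel width, and regularizer are illustrative choices; the mathematical-programming reformulation and the sparse variant are not shown.

```python
# Kernel Fisher Discriminant, two-class case, RBF kernel.
import numpy as np

def rbf(X, Y, gamma=1.0):
    d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

rng = np.random.default_rng(2)
X1 = rng.normal([0.0, 0.0], 0.3, size=(40, 2))   # class 1
X2 = rng.normal([1.5, 1.5], 0.3, size=(40, 2))   # class 2
X = np.vstack([X1, X2])
K = rbf(X, X)

idx1, idx2 = np.arange(40), np.arange(40, 80)
m1, m2 = K[:, idx1].mean(1), K[:, idx2].mean(1)  # kernel class means

# Within-class scatter expressed through the kernel matrix.
N = np.zeros((80, 80))
for idx in (idx1, idx2):
    Kc = K[:, idx]
    C = np.eye(len(idx)) - np.full((len(idx), len(idx)), 1 / len(idx))
    N += Kc @ C @ Kc.T
N += 1e-3 * np.eye(80)            # regularization

# alpha maximizes the Fisher criterion in feature space.
alpha = np.linalg.solve(N, m1 - m2)

# Project points: f(x) = sum_i alpha_i k(x_i, x).
proj = K @ alpha
print(proj[idx1].mean(), proj[idx2].mean())
```

Projections of the two classes separate along the learned direction, which is the quantity a threshold is then fitted to.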


Large Scale Bayes Point Machines

Neural Information Processing Systems

The concept of averaging over classifiers is fundamental to the Bayesian analysis of learning. Based on this viewpoint, it has recently been demonstrated for linear classifiers that the centre of mass of version space (the set of all classifiers consistent with the training set), also known as the Bayes point, exhibits excellent generalisation abilities. Computing the Bayes point exactly is, however, computationally demanding. In this paper we present a method based on the simple perceptron learning algorithm which allows us to overcome this algorithmic drawback. The method is algorithmically simple and is easily extended to the multi-class case. We present experimental results on the MNIST data set of handwritten digits which show that Bayes point machines (BPMs) are competitive with the current world champion, the support vector machine.
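The perceptron-based approximation can be sketched as follows: each random permutation of a separable training set drives the perceptron to a different consistent classifier (a sample from version space), and averaging these samples approximates the centre of mass. The toy data and the number of permutations are assumptions for the example.

```python
# Approximate the Bayes point by averaging perceptron solutions
# obtained from random permutations of a separable training set.
import numpy as np

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-3, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

def perceptron(X, y, order, epochs=50):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for i in order:
            if y[i] * (w @ X[i]) <= 0:
                w += y[i] * X[i]
                mistakes += 1
        if mistakes == 0:          # converged: w is consistent
            break
    return w

# Each permutation yields one version-space sample; their mean
# direction approximates the Bayes point.
ws = [perceptron(X, y, rng.permutation(len(y))) for _ in range(20)]
w_bp = np.mean([w / np.linalg.norm(w) for w in ws], axis=0)

acc = np.mean(np.sign(X @ w_bp) == y)
print("training accuracy:", acc)
```

Only normalized directions are averaged, since version space is a region on the unit sphere of weight vectors.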


A Comparison of Image Processing Techniques for Visual Speech Recognition Applications

Neural Information Processing Systems

These methods are compared on their performance on a visual speech recognition task. While the representations developed are specific to visual speech recognition, the methods themselves are general purpose and applicable to other tasks. Our focus is on low-level, data-driven methods based on the statistical properties of relatively untouched images, as opposed to approaches that work with contours or highly processed versions of the image. Padgett [8] and Bartlett [1] systematically studied statistical methods for developing representations on expression recognition tasks. They found that local, wavelet-like representations consistently outperformed global representations such as eigenfaces. In this paper we also compare local versus global representations.
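The local-versus-global contrast can be sketched as follows: a global representation applies PCA to whole images (eigenfaces-style), while a local one applies PCA to small patches pooled across positions. The random images and patch size here are purely illustrative stand-ins for mouth-region frames.

```python
# Global PCA (whole images) vs local PCA (pooled small patches).
import numpy as np

rng = np.random.default_rng(4)
images = rng.random((100, 16, 16))   # stand-in for image frames

# Global: each image is one vector; components span whole images.
flat = images.reshape(100, -1)
_, _, vt_global = np.linalg.svd(flat - flat.mean(0), full_matrices=False)
global_basis = vt_global[:10]        # 10 components, 256-dim each

# Local: gather 4x4 patches from all positions of all images;
# components are local filters shared across positions.
patches = np.array([img[i:i + 4, j:j + 4].ravel()
                    for img in images
                    for i in range(0, 16, 4)
                    for j in range(0, 16, 4)])
_, _, vt_local = np.linalg.svd(patches - patches.mean(0),
                               full_matrices=False)
local_basis = vt_local[:10]          # 10 filters, 16-dim each

print(global_basis.shape, local_basis.shape)  # (10, 256) (10, 16)
```

The local basis has far fewer parameters per component and is applied convolutionally across positions, which is one intuition behind the reported advantage of local representations.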


Algorithmic Stability and Generalization Performance

Neural Information Processing Systems

A stable learner is one for which the learned solution does not change much with small changes in the training set. The bounds we obtain do not depend on any measure of the complexity of the hypothesis space (e.g. VC dimension) but rather depend on how the learning algorithm searches this space, and can thus be applied even when the VC dimension is infinite. We demonstrate that regularization networks possess the required stability property and apply our method to obtain new bounds on their generalization performance.
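Stability in this sense can be illustrated empirically with regularized least squares (a simple regularization network): train on a set S and on S with one example replaced, and compare the two learned functions. The data and regularization values are assumptions for the demonstration; the expected pattern is that stronger regularization makes the learned function less sensitive to the replacement.

```python
# Empirical stability check for regularized least squares.
import numpy as np

rng = np.random.default_rng(5)
X = rng.random((50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)

def ridge(X, y, lam):
    """Regularized least squares: (X'X + lam I) w = X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Perturbed training set: one example replaced.
X2, y2 = X.copy(), y.copy()
X2[0], y2[0] = rng.random(3), rng.standard_normal()

diffs = []
for lam in (0.01, 1.0, 100.0):
    w, w2 = ridge(X, y, lam), ridge(X2, y2, lam)
    diffs.append(np.abs(X @ (w - w2)).max())   # max change in f over S
    print(f"lambda={lam:>6}: max |f_S - f_S'| = {diffs[-1]:.5f}")
```

The maximum change in the learned function shrinks as the regularization parameter grows, matching the stability-based view: the bound tracks how the algorithm searches the hypothesis space, not the size of that space.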


Factored Semi-Tied Covariance Matrices

Neural Information Processing Systems

A new form of covariance modelling for Gaussian mixture models and hidden Markov models is presented. This is an extension of an efficient form of covariance modelling used in speech recognition, semi-tied covariance matrices. In the standard form of semi-tied covariance matrices the covariance matrix is decomposed into a highly shared decorrelating transform and a component-specific diagonal covariance matrix. The use of a factored decorrelating transform is presented in this paper. This factoring effectively increases the number of possible transforms without increasing the number of free parameters. Maximum likelihood estimation schemes for all the model parameters are presented, including the component/transform assignment, transform, and component parameters. This new model form is evaluated on a large vocabulary speech recognition task. It is shown that using this factored form of covariance modelling reduces the word error rate.

1 Introduction

A standard problem in machine learning is how to efficiently model correlations in multidimensional data. Solutions should be efficient both in terms of the number of model parameters and the cost of the likelihood calculation. For speech recognition this is particularly important due to the large number of Gaussian components used, typically in the tens of thousands, and the relatively large dimensionality of the data, typically 30-60.
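The standard (unfactored) semi-tied decomposition can be sketched numerically: components share one decorrelating transform A and keep only per-component diagonals, so the implied full covariance is Sigma_m = A^{-1} D_m A^{-T}. The dimensions and parameter values below are illustrative; the factored transforms and ML estimation schemes of the paper are not shown.

```python
# Semi-tied covariance: shared decorrelating transform A,
# per-component diagonal covariances D_m.
import numpy as np

rng = np.random.default_rng(6)
d, M = 4, 3                                  # dimension, components

A = np.linalg.qr(rng.standard_normal((d, d)))[0]   # shared transform
diags = rng.random((M, d)) + 0.5                   # per-component D_m
means = rng.standard_normal((M, d))

def semi_tied_logpdf(x, m):
    """log N(x; mu_m, A^{-1} D_m A^{-T}), evaluated cheaply:
    decorrelate once with A, then use the diagonal."""
    z = A @ (x - means[m])
    logdet_A = np.log(np.abs(np.linalg.det(A)))    # 0 for orthogonal A
    return (logdet_A - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(diags[m]).sum()
            - 0.5 * (z ** 2 / diags[m]).sum())

# Check against a direct full-covariance evaluation of the same density.
x, m = rng.standard_normal(d), 0
Ainv = np.linalg.inv(A)
Sigma = Ainv @ np.diag(diags[m]) @ Ainv.T
v = x - means[m]
full = (-0.5 * d * np.log(2 * np.pi)
        - 0.5 * np.log(np.linalg.det(Sigma))
        - 0.5 * v @ np.linalg.solve(Sigma, v))
print(semi_tied_logpdf(x, m), full)
```

The semi-tied evaluation needs one shared matrix-vector product plus diagonal operations per component, which is what makes the scheme affordable with tens of thousands of Gaussians.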


Interactive Parts Model: An Application to Recognition of On-line Cursive Script

Neural Information Processing Systems

In this work, we introduce an Interactive Parts (IP) model as an alternative to Hidden Markov Models (HMMs). We tested both models on a database of online cursive script. We show that implementations of HMMs and the IP model, in which all letters are assumed to have the same average width, give comparable results. However, in contrast to HMMs, the IP model can handle duration modeling without an increase in computational complexity.

1 Introduction

Hidden Markov models [9] have been a dominant paradigm in speech and handwriting recognition over the past several decades. The success of HMMs is primarily due to their ability to model the statistical and sequential nature of speech and handwriting data.


Text Classification using String Kernels

Neural Information Processing Systems

A subsequence is any ordered sequence of k characters occurring in the text, though not necessarily contiguously. A direct computation of this feature vector would involve a prohibitive amount of computation even for modest values of k, since the dimension of the feature space grows exponentially with k. The paper describes how, despite this fact, the inner product can be efficiently evaluated by a dynamic programming technique. An effective alternative to explicit feature extraction is provided by kernel methods. The learning then takes place in the feature space, provided the learning algorithm can be entirely rewritten so that the data points only appear inside dot products with other data points.
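The dynamic programme can be sketched as follows. This is a direct implementation of the standard subsequence-kernel recursion, where `lam` is the decay factor that penalizes gaps in non-contiguous matches; with `lam = 1` the kernel simply counts common length-k subsequences with multiplicity.

```python
def ssk(s, t, k, lam=0.5):
    """Subsequence kernel K_k(s, t): sum over common length-k
    subsequences, each weighted by lam ** (total span in s and t)."""
    n, m = len(s), len(t)
    # Kp[i][a][b] = K'_i(s[:a], t[:b]), the auxiliary kernel.
    Kp = [[[0.0] * (m + 1) for _ in range(n + 1)] for _ in range(k)]
    for a in range(n + 1):
        for b in range(m + 1):
            Kp[0][a][b] = 1.0
    for i in range(1, k):
        for a in range(i, n + 1):
            Kpp = 0.0                       # K''_i(s[:a], t[:b]), rolling in b
            for b in range(i, m + 1):
                Kpp *= lam
                if s[a - 1] == t[b - 1]:
                    Kpp += lam * lam * Kp[i - 1][a - 1][b - 1]
                Kp[i][a][b] = lam * Kp[i][a - 1][b] + Kpp
    # Final level: close off length-k subsequences at each matching pair.
    result = 0.0
    for a in range(1, n + 1):
        for b in range(1, m + 1):
            if s[a - 1] == t[b - 1]:
                result += lam * lam * Kp[k - 1][a - 1][b - 1]
    return result

print(ssk("cat", "cat", 2, lam=1.0))   # 3 common length-2 subsequences
print(ssk("cat", "car", 2, lam=0.5))   # only "ca" matches: 0.5**4
```

The cost is O(k |s| |t|) rather than exponential in k. In practice the kernel is normalized, K(s, t) / sqrt(K(s, s) K(t, t)), to remove the effect of document length.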


Occam's Razor

Neural Information Processing Systems

The Bayesian paradigm apparently only sometimes gives rise to Occam's Razor; at other times very large models perform well. We give simple examples of both kinds of behaviour. The two views are reconciled when measuring the complexity of functions, rather than of the machinery used to implement them. We analyze the complexity of functions for some linear-in-the-parameters models that are equivalent to Gaussian Processes, and always find Occam's Razor at work.

1 Introduction

Occam's Razor is a well known principle of "parsimony of explanations" which is influential in scientific thinking in general and in problems of statistical inference in particular. In this paper we review its consequences for Bayesian statistical models, where its behaviour can be easily demonstrated and quantified.