Ranking on Data Manifolds
Zhou, Dengyong, Weston, Jason, Gretton, Arthur, Bousquet, Olivier, Schölkopf, Bernhard
The Google search engine has enjoyed huge success with its web page ranking algorithm, which exploits the global, rather than local, hyperlink structure of the web using random walks. Here we propose a simple universal ranking algorithm for data lying in a Euclidean space, such as text or image data. The core idea of our method is to rank the data with respect to the intrinsic manifold structure collectively revealed by a large amount of data. Encouraging experimental results on synthetic, image, and text data illustrate the validity of our method.
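The abstract does not spell out the ranking rule itself, so the following is only a minimal sketch of one common way to realize manifold ranking: build an RBF affinity graph, normalize it symmetrically, and iteratively spread scores from the query points. The function name, the kernel width sigma, and the spreading weight alpha are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def manifold_rank(X, query_idx, sigma=1.0, alpha=0.99, n_iter=100):
    # Pairwise squared Euclidean distances and RBF affinities (no self-loops).
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # D^{-1/2} W D^{-1/2}
    y = np.zeros(len(X))
    y[query_idx] = 1.0                       # query points seed the ranking
    f = y.copy()
    for _ in range(n_iter):
        f = alpha * S @ f + (1 - alpha) * y  # spread scores along the manifold
    return np.argsort(-f)                    # indices sorted by ranking score
```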
A Classification-based Cocktail-party Processor
Roman, Nicoleta, Wang, Deliang, Brown, Guy J.
At a cocktail party, a listener can selectively attend to a single voice and filter out other acoustic interference. How to simulate this perceptual ability remains a great challenge. This paper describes a novel supervised learning approach to speech segregation, in which a target speech signal is separated from interfering sounds using spatial location cues: interaural time differences (ITD) and interaural intensity differences (IID). Motivated by the auditory masking effect, we employ the notion of an ideal time-frequency binary mask, which selects the target if it is stronger than the interference in a local time-frequency unit. Within a narrow frequency band, modifications to the relative strength of the target source with respect to the interference trigger systematic changes in the estimated ITD and IID.
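As a hedged illustration of the ideal binary mask idea (using a plain STFT rather than the auditory filterbank and binaural localization cues of the actual system), one could compute the mask directly from the premixed target and interference signals; the sampling rate and window length are illustrative.

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(target, interference, fs=16000, nperseg=512):
    # Time-frequency decomposition of target and interference.
    _, _, T = stft(target, fs=fs, nperseg=nperseg)
    _, _, I = stft(interference, fs=fs, nperseg=nperseg)
    # Keep a unit whenever the target is locally stronger than the interference.
    mask = (np.abs(T) ** 2 > np.abs(I) ** 2).astype(float)
    # Apply the mask to the mixture and resynthesize the segregated signal.
    _, _, M = stft(target + interference, fs=fs, nperseg=nperseg)
    _, segregated = istft(mask * M, fs=fs, nperseg=nperseg)
    return mask, segregated
```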
An Infinity-sample Theory for Multi-category Large Margin Classification
The purpose of this paper is to investigate infinity-sample properties of risk minimization based multi-category classification methods. These methods can be considered as natural extensions of binary large margin classification. We establish conditions that guarantee the infinity-sample consistency of classifiers obtained in the risk minimization framework. Examples are provided for two specific forms of the general formulation, which extend a number of known methods. Using these examples, we show that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem. Such conditional probability information will be useful for statistical inference tasks beyond classification.
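For concreteness, one common instance of such a multi-category risk minimization formulation (not necessarily the exact one analyzed in the paper) replaces the 0-1 loss by a convex surrogate applied to pairwise margins:

```latex
\hat{f} = \arg\min_{f} \; \frac{1}{n} \sum_{i=1}^{n} \sum_{c \neq y_i}
          \phi\bigl(f_{y_i}(x_i) - f_c(x_i)\bigr),
\qquad \phi(t) = \max(0,\, 1 - t),
```

and infinity-sample consistency asks whether the minimizer of the corresponding expected risk induces the Bayes rule \(\arg\max_c P(Y = c \mid X = x)\).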
Automatic Annotation of Everyday Movements
Ramanan, Deva, Forsyth, David A.
This paper describes a system that can annotate a video sequence with: a description of the appearance of each actor; when the actor is in view; and a representation of the actor's activity while in view. The system does not require a fixed background, and is automatic. The system works by (1) tracking people in 2D and then, using an annotated motion capture dataset, (2) synthesizing an annotated 3D motion sequence matching the 2D tracks. The 3D motion capture data is manually annotated off-line using a class structure that describes everyday motions and allows motion annotations to be composed -- one may jump while running, for example. Descriptions computed from video of real motions show that the method is accurate.
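The matching step can be pictured with a deliberately simplified sketch: compare the tracked 2D joint positions against projected 3D motion-capture snippets and report the annotations of the closest one. The orthographic projection, the joint layout, and the absence of any camera or temporal alignment are simplifying assumptions, not the paper's actual synthesis procedure.

```python
import numpy as np

def annotate_track(track_2d, mocap_library):
    """track_2d: (T, J, 2) tracked joint positions.
    mocap_library: list of (motion_3d, annotations), motion_3d of shape (T', J, 3)."""
    best_score, best_labels = np.inf, None
    for motion_3d, labels in mocap_library:
        proj = motion_3d[:, :, :2]                   # crude orthographic projection
        T = min(len(track_2d), len(proj))
        score = np.mean((track_2d[:T] - proj[:T]) ** 2)
        if score < best_score:                       # keep the closest snippet
            best_score, best_labels = score, labels
    return best_labels                               # annotations of the best match
```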
Learning with Local and Global Consistency
Zhou, Dengyong, Bousquet, Olivier, Lal, Thomas N., Weston, Jason, Schölkopf, Bernhard
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data.
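The abstract leaves the algorithm unstated, but a label-spreading scheme in this spirit can be sketched as follows: propagate one-hot labels over a symmetrically normalized affinity graph and read off the closed-form smooth solution. The kernel width sigma and the trade-off alpha are illustrative choices.

```python
import numpy as np

def spread_labels(X, y, labeled_idx, n_classes, sigma=1.0, alpha=0.99):
    # RBF affinities with zero diagonal, symmetrically normalized.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))                  # D^{-1/2} W D^{-1/2}
    # One-hot labels for the labeled points only.
    Y = np.zeros((len(X), n_classes))
    Y[labeled_idx, y[labeled_idx]] = 1.0
    # Closed-form smooth solution F = (I - alpha*S)^{-1} Y.
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return F.argmax(axis=1)                          # predicted class per point
```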
Learning a Distance Metric from Relative Comparisons
Schultz, Matthew, Joachims, Thorsten
This paper presents a method for learning a distance metric from relative comparisons such as "A is closer to B than A is to C". Taking a Support Vector Machine (SVM) approach, we develop an algorithm that provides a flexible way of describing qualitative training data as a set of constraints. We show that such constraints lead to a convex quadratic programming problem that can be solved by adapting standard methods for SVM training. We empirically evaluate the performance and the modelling flexibility of the algorithm on a collection of text documents.
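A sketch of the kind of convex program the abstract suggests, restricted for simplicity to learning non-negative per-feature weights with SVM-style slack variables; the variable names and the regularization constant C are assumptions.

```python
import numpy as np
import cvxpy as cp

def learn_diag_metric(X, triples, C=1.0):
    """X: (n, d) data matrix; triples: list of (a, b, c) meaning A is closer to B than to C."""
    d = X.shape[1]
    w = cp.Variable(d, nonneg=True)                  # per-feature metric weights
    xi = cp.Variable(len(triples), nonneg=True)      # SVM-style slack variables
    constraints = []
    for k, (a, b, c) in enumerate(triples):
        gap = (X[a] - X[c]) ** 2 - (X[a] - X[b]) ** 2
        constraints.append(gap @ w >= 1 - xi[k])     # margin on the distance gap
    objective = cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi))
    cp.Problem(objective, constraints).solve()
    return w.value
```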
Approximate Analytical Bootstrap Averages for Support Vector Classifiers
Malzahn, Dörthe, Opper, Manfred
We compute approximate analytical bootstrap averages for support vector classification using a combination of the replica method of statistical physics and the TAP approach for approximate inference. We test our method on a few datasets and compare it with exact averages obtained by extensive Monte-Carlo sampling.
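The Monte-Carlo baseline mentioned at the end of the abstract (not the analytical replica/TAP approximation itself) can be sketched as straightforward bootstrap resampling around an off-the-shelf SVM; the number of resamples and the SVM settings below are illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def bootstrap_test_error(X_train, y_train, X_test, y_test, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X_train), size=len(X_train))  # resample with replacement
        if len(np.unique(y_train[idx])) < 2:                    # skip degenerate resamples
            continue
        clf = SVC(kernel="rbf", C=1.0).fit(X_train[idx], y_train[idx])
        errors.append(1.0 - clf.score(X_test, y_test))          # bootstrapped test error
    return float(np.mean(errors)), float(np.std(errors))
```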
Generalised Propagation for Fast Fourier Transforms with Partial or Missing Data
Discrete Fourier transforms and other related Fourier methods have been practically implementable largely because of the fast Fourier transform (FFT). However, there are many situations where it would be desirable to perform fast Fourier transforms without complete data. In this paper it is recognised that formulating the FFT algorithm as a belief network allows suitable priors to be set for the Fourier coefficients. Furthermore, efficient generalised belief propagation methods between clusters of four nodes enable the Fourier coefficients to be inferred and the missing data to be estimated in close to O(n log n) time, where n is the total number of given and missing data points. This method is compared with a number of common approaches, such as setting the missing data to zero or interpolating it. It is tested on generated data and on the Fourier analysis of a damaged audio signal.
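The two simple baselines the method is compared against can be sketched directly; the generalised-propagation inference itself is not shown, and the function name is made up for illustration.

```python
import numpy as np

def fft_with_missing(signal, missing_mask, method="interp"):
    """signal: 1-D array; missing_mask: boolean array, True where samples are missing."""
    x = np.asarray(signal, dtype=float).copy()
    if method == "zero":
        x[missing_mask] = 0.0                        # baseline 1: zero-fill the gaps
    else:
        known = ~missing_mask                        # baseline 2: linear interpolation
        x[missing_mask] = np.interp(np.flatnonzero(missing_mask),
                                    np.flatnonzero(known), x[known])
    return np.fft.fft(x)                             # ordinary O(n log n) FFT
```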