Europe
Implicit Wiener Series for Higher-Order Image Analysis
Franz, Matthias O., Schölkopf, Bernhard
The computation of classical higher-order statistics such as higher-order moments or spectra is difficult for images due to the huge number of terms to be estimated and interpreted. We propose an alternative approach in which multiplicative pixel interactions are described by a series of Wiener functionals. Since the functionals are estimated implicitly via polynomial kernels, the combinatorial explosion associated with the classical higher-order statistics is avoided. First results show that image structures such as lines or corners can be predicted correctly, and that pixel interactions up to the order of five play an important role in natural images. Most of the interesting structure in a natural image is characterized by its higher-order statistics.
A Hidden Markov Model for de Novo Peptide Sequencing
Fischer, Bernd, Roth, Volker, Grossmann, Jonas, Baginsky, Sacha, Gruissem, Wilhelm, Roos, Franz, Widmayer, Peter, Buhmann, Joachim M.
De novo Sequencing of peptides is a challenging task in proteome research. While there exist reliable DNAsequencing methods, the highthroughput de novo sequencing of proteins by mass spectrometry is still an open problem. Current approaches suffer from a lack in precision to detect mass peaks in the spectrograms. In this paper we present a novel method for de novo peptide sequencing based on a hidden Markov model. Experiments effectively demonstrate that this new method significantly outperforms standard approaches in matching quality.
Sampling Methods for Unsupervised Learning
Fergus, Rob, Zisserman, Andrew, Perona, Pietro
We present an algorithm to overcome the local maxima problem in estimating the parameters of mixture models. It combines existing approaches from both EM and a robust fitting algorithm, RANSAC, to give a data-driven stochastic learning scheme. Minimal subsets of data points, sufficient to constrain the parameters of the model, are drawn from proposal densities to discover new regions of high likelihood. The proposal densities are learnt using EM and bias the sampling toward promising solutions. The algorithm is computationally efficient, as well as effective at escaping from local maxima. We compare it with alternative methods, including EM and RANSAC, on both challenging synthetic data and the computer vision problem of alpha-matting.
Seeing through water
Efros, Alexei, Isler, Volkan, Shi, Jianbo, Visontai, Mirkó
We consider the problem of recovering an underwater image distorted by surface waves. A large amount of video data of the distorted image is acquired. The problem is posed in terms of finding an undistorted image patch at each spatial location. This challenging reconstruction task can be formulated as a manifold learning problem, such that the center of the manifold is the image of the undistorted patch. To compute the center, we present a new technique to estimate global distances on the manifold. Our technique achieves robustness through convex flow computations and solves the "leakage" problem inherent in recent manifold embedding techniques.
Making Latin Manuscripts Searchable using gHMM's
Edwards, Jaety, Teh, Yee W., Bock, Roger, Maire, Michael, Vesom, Grace, Forsyth, David A.
We describe a method that can make a scanned, handwritten mediaeval latin manuscript accessible to full text search. A generalized HMM is fitted, using transcribed latin to obtain a transition model and one example each of 22 letters to obtain an emission model. We show results for unigram, bigram and trigram models.
Triangle Fixing Algorithms for the Metric Nearness Problem
Sra, Suvrit, Tropp, Joel, Dhillon, Inderjit S.
Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the "nearest" matrix of distances that satisfy the triangle inequalities.
Bayesian inference in spiking neurons
We propose a new interpretation of spiking neurons as Bayesian integrators accumulating evidence over time about events in the external world or the body, and communicating to other neurons their certainties about these events. In this model, spikes signal the occurrence of new information, i.e. what cannot be predicted from the past activity. As a result, firing statistics are close to Poisson, albeit providing a deterministic representation of probabilities. We proceed to develop a theory of Bayesian inference in spiking neural networks, recurrent interactions implementing a variant of belief propagation. Many perceptual and motor tasks performed by the central nervous system are probabilistic, and can be described in a Bayesian framework [4, 3].
The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees
Dekel, Ofer, Shalev-shwartz, Shai, Singer, Yoram
Prediction suffix trees (PST) provide a popular and effective tool for tasks such as compression, classification, and language modeling. In this paper we take a decision theoretic view of PSTs for the task of sequence prediction. Generalizing the notion of margin to PSTs, we present an online PST learning algorithm and derive a loss bound for it. The depth of the PST generated by this algorithm scales linearly with the length of the input. We then describe a self-bounded enhancement of our learning algorithm which automatically grows a bounded-depth PST. We also prove an analogous mistake-bound for the self-bounded algorithm. The result is an efficient algorithm that neither relies on a-priori assumptions on the shape or maximal depth of the target PST nor does it require any parameters. To our knowledge, this is the first provably-correct PST learning algorithm which generates a bounded-depth PST while being competitive with any fixed PST determined in hindsight.
Semigroup Kernels on Finite Sets
Cuturi, Marco, Vert, Jean-philippe
Complex objects can often be conveniently represented by finite sets of simpler components, such as images by sets of patches or texts by bags of words. We study the class of positive definite (p.d.) kernels for two such objects that can be expressed as a function of the merger of their respective sets of components. We prove a general integral representation of such kernels and present two particular examples. One of them leads to a kernel for sets of points living in a space endowed itself with a positive definite kernel. We provide experimental results on a benchmark experiment of handwritten digits image classification which illustrate the validity of the approach.
Trait Selection for Assessing Beef Meat Quality Using Non-linear SVM
Coz, Juan, Bayón, Gustavo F., Díez, Jorge, Luaces, Oscar, Bahamonde, Antonio, Sañudo, Carlos
In this paper we show that it is possible to model sensory impressions of consumers about beef meat. This is not a straightforward task; the reason is that when we are aiming to induce a function that maps object descriptions into ratings, we must consider that consumers' ratings are just a way to express their preferences about the products presented in the same testing session. Therefore, we had to use a special purpose SVM polynomial kernel. The training data set used collects the ratings of panels of experts and consumers; the meat was provided by 103 bovines of 7 Spanish breeds with different carcass weights and aging periods. Additionally, to gain insight into consumer preferences, we used feature subset selection tools. The result is that aging is the most important trait for improving consumers' appreciation of beef meat.