Joint Probabilistic Curve Clustering and Alignment
Gaffney, Scott J., Smyth, Padhraic
Clustering and prediction of sets of curves is an important problem in many areas of science and engineering. It is often the case that curves tend to be misaligned from each other in a continuous manner, either in space (across the measurements) or in time. We develop a probabilistic framework that allows for joint clustering and continuous alignment of sets of curves in curve space (as opposed to a fixed-dimensional feature-vector space). The proposed methodology integrates new probabilistic alignment models with model-based curve clustering algorithms. The probabilistic approach allows for the derivation of consistent EM learning algorithms for the joint clustering-alignment problem. Experimental results are shown for alignment of human growth data, and joint clustering and alignment of gene expression time-course data.
Implicit Wiener Series for Higher-Order Image Analysis
Franz, Matthias O., Schölkopf, Bernhard
The computation of classical higher-order statistics such as higher-order moments or spectra is difficult for images due to the huge number of terms to be estimated and interpreted. We propose an alternative approach in which multiplicative pixel interactions are described by a series of Wiener functionals. Since the functionals are estimated implicitly via polynomial kernels, the combinatorial explosion associated with the classical higher-order statistics is avoided. First results show that image structures such as lines or corners can be predicted correctly, and that pixel interactions up to the order of five play an important role in natural images. Most of the interesting structure in a natural image is characterized by its higher-order statistics.
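As a rough illustration of why polynomial kernels sidestep the combinatorial explosion, the sketch below compares a homogeneous polynomial kernel with an explicit expansion into all ordered monomials of the same degree. It is not the Wiener-series estimator from the abstract; the function names are hypothetical, and the homogeneous kernel is chosen only because its feature map is easy to enumerate.

```python
import itertools
import math


def poly_kernel(x, y, degree):
    """Homogeneous polynomial kernel (x . y) ** degree: one dot product."""
    return sum(a * b for a, b in zip(x, y)) ** degree


def explicit_monomials(x, degree):
    """All ordered products of `degree` entries of x: len(x) ** degree terms."""
    return [math.prod(combo) for combo in itertools.product(x, repeat=degree)]


def explicit_inner(x, y, degree):
    """Inner product computed in the explicit monomial feature space."""
    fx = explicit_monomials(x, degree)
    fy = explicit_monomials(y, degree)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    x, y = [1.0, 2.0, 3.0], [0.5, -1.0, 2.0]
    # Same value either way, but the explicit route enumerates 3 ** 3 terms.
    print(poly_kernel(x, y, 3), explicit_inner(x, y, 3))
```

For an image patch with n pixels the explicit route needs n ** degree terms, whereas the kernel evaluation costs a single dot product regardless of the degree.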
A Hidden Markov Model for de Novo Peptide Sequencing
Fischer, Bernd, Roth, Volker, Grossmann, Jonas, Baginsky, Sacha, Gruissem, Wilhelm, Roos, Franz, Widmayer, Peter, Buhmann, Joachim M.
De novo sequencing of peptides is a challenging task in proteome research. While there exist reliable DNA-sequencing methods, the high-throughput de novo sequencing of proteins by mass spectrometry is still an open problem. Current approaches suffer from a lack of precision in detecting mass peaks in the spectrograms. In this paper we present a novel method for de novo peptide sequencing based on a hidden Markov model. Experiments effectively demonstrate that this new method significantly outperforms standard approaches in matching quality.
On-Chip Compensation of Device-Mismatch Effects in Analog VLSI Neural Networks
Figueroa, Miguel, Bridges, Seth, Diorio, Chris
Device mismatch in VLSI degrades the accuracy of analog arithmetic circuits and lowers the learning performance of large-scale neural networks implemented in this technology. We show compact, low-power on-chip calibration techniques that compensate for device mismatch. Our techniques enable large-scale analog VLSI neural networks with learning performance on the order of 10 bits. We demonstrate our techniques on a 64-synapse linear perceptron learning with the Least-Mean-Squares (LMS) algorithm, and fabricated in a 0.35 µm CMOS process.
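For readers unfamiliar with the LMS rule mentioned above, here is a minimal software sketch of a linear perceptron trained with it. It models none of the analog circuitry or mismatch compensation; the function name, learning rate, and epoch count are illustrative assumptions.

```python
import random


def lms_train(samples, n_inputs, rate=0.01, epochs=50):
    """Train a linear unit with the LMS (Widrow-Hoff) rule.

    samples: list of (inputs, target) pairs, inputs being a list of floats.
    Returns the learned weight list.
    """
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # Each weight moves in proportion to the error and its own input.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    true_w = [rng.uniform(-1, 1) for _ in range(8)]  # hypothetical target weights
    data = []
    for _ in range(100):
        x = [rng.uniform(-1, 1) for _ in range(8)]
        data.append((x, sum(w * xi for w, xi in zip(true_w, x))))
    learned = lms_train(data, n_inputs=8)
    print(max(abs(a - b) for a, b in zip(true_w, learned)))
```

The demo uses 8 inputs rather than 64 only to keep the run short; the update rule is the same for any input count.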
Sampling Methods for Unsupervised Learning
Fergus, Rob, Zisserman, Andrew, Perona, Pietro
We present an algorithm to overcome the local maxima problem in estimating the parameters of mixture models. It combines existing approaches from both EM and a robust fitting algorithm, RANSAC, to give a data-driven stochastic learning scheme. Minimal subsets of data points, sufficient to constrain the parameters of the model, are drawn from proposal densities to discover new regions of high likelihood. The proposal densities are learnt using EM and bias the sampling toward promising solutions. The algorithm is computationally efficient, as well as effective at escaping from local maxima. We compare it with alternative methods, including EM and RANSAC, on both challenging synthetic data and the computer vision problem of alpha-matting.
Learning Hyper-Features for Visual Identification
Ferencz, Andras D., Learned-Miller, Erik G., Malik, Jitendra
We address the problem of identifying specific instances of a class (cars) from a set of images all belonging to that class. Although we cannot build a model for any particular instance (as we may be provided with only one "training" example of it), we can use information extracted from observing other members of the class. We pose this task as a learning problem, in which the learner is given image pairs, labeled as matching or not, and must discover which image features are most consistent for matching instances and discriminative for mismatches. We explore a patch-based representation, where we model the distributions of similarity measurements defined on the patches. Finally, we describe an algorithm that selects the most salient patches based on a mutual information criterion. This algorithm performs identification well for our challenging dataset of car images, after matching only a few well-chosen patches.
Experts in a Markov Decision Process
Even-Dar, Eyal, Kakade, Sham M., Mansour, Yishay
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts setting, we address the question of how well an agent can do when compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms, which have regret bounds with no dependence on the size of the state space. Instead, these bounds depend only on a certain horizon time of the process and logarithmically on the number of actions. We also show that in the case that the dynamics change over time, the problem becomes computationally hard.
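The experts setting referred to above can be sketched with the standard exponential-weights (Hedge) algorithm. This is the classical stateless experts algorithm, not the MDP algorithm from the abstract; the function name, the reward range [0, 1], and the step-size choice are assumptions made for illustration.

```python
import math


def hedge(reward_rounds, n_actions, eta):
    """Run exponential weights over a sequence of reward vectors in [0, 1].

    Returns (expected total reward, final action distribution).
    """
    weights = [1.0] * n_actions
    probs = [1.0 / n_actions] * n_actions
    total = 0.0
    for rewards in reward_rounds:
        norm = sum(weights)
        probs = [w / norm for w in weights]
        # Expected reward of playing an action drawn from probs this round.
        total += sum(p * r for p, r in zip(probs, rewards))
        # Actions that earned more this round gain weight multiplicatively.
        weights = [w * math.exp(eta * r) for w, r in zip(weights, rewards)]
    return total, probs


if __name__ == "__main__":
    n, horizon = 4, 1000
    rounds = [[((t * (i + 1)) % 3) / 2.0 for i in range(n)] for t in range(horizon)]
    eta = math.sqrt(8.0 * math.log(n) / horizon)
    earned, _ = hedge(rounds, n, eta)
    best = max(sum(r[i] for r in rounds) for i in range(n))
    print(best - earned, math.sqrt(horizon * math.log(n) / 2.0))
```

With this step size the regret against the best fixed action is at most sqrt(T ln(N) / 2) for T rounds and N actions, which is the logarithmic dependence on the number of actions that the experts literature is known for.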
Seeing through water
Efros, Alexei, Isler, Volkan, Shi, Jianbo, Visontai, Mirkó
We consider the problem of recovering an underwater image distorted by surface waves. A large amount of video data of the distorted image is acquired. The problem is posed in terms of finding an undistorted image patch at each spatial location. This challenging reconstruction task can be formulated as a manifold learning problem, such that the center of the manifold is the image of the undistorted patch. To compute the center, we present a new technique to estimate global distances on the manifold. Our technique achieves robustness through convex flow computations and solves the "leakage" problem inherent in recent manifold embedding techniques.
Making Latin Manuscripts Searchable using gHMM's
Edwards, Jaety, Teh, Yee W., Bock, Roger, Maire, Michael, Vesom, Grace, Forsyth, David A.
We describe a method that can make a scanned, handwritten mediaeval Latin manuscript accessible to full-text search. A generalized HMM is fitted, using transcribed Latin to obtain a transition model and one example each of 22 letters to obtain an emission model. We show results for unigram, bigram and trigram models.
Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units
Doi, Eizaburo, Lewicki, Michael S.
It has been suggested that the primary goal of the sensory system is to represent input in such a way as to reduce the high degree of redundancy. Given a noisy neural representation, however, solely reducing redundancy is not desirable, since redundancy is the only clue to reduce the effects of noise. Here we propose a model that best balances redundancy reduction and redundant representation. Like previous models, our model accounts for the localized and oriented structure of simple cells, but it also predicts a different organization for the population. With noisy, limited-capacity units, the optimal representation becomes an overcomplete, multi-scale representation, which, compared to previous models, is in closer agreement with physiological data. These results offer a new perspective on the expansion of the number of neurons from retina to V1 and provide a theoretical model of incorporating useful redundancy into efficient neural representations.