Information Technology
Sparse Coding of Natural Images Using an Overcomplete Set of Limited Capacity Units
Doi, Eizaburo, Lewicki, Michael S.
It has been suggested that the primary goal of the sensory system is to represent input in such a way as to reduce the high degree of redundancy. Given a noisy neural representation, however, solely reducing redundancy is not desirable, since redundancy is the only clue to reduce the effects of noise. Here we propose a model that best balances redundancy reduction and redundant representation. Like previous models, our model accounts for the localized and oriented structure of simple cells, but it also predicts a different organization for the population. With noisy, limited-capacity units, the optimal representation becomes an overcomplete, multi-scale representation, which, compared to previous models, is in closer agreement with physiological data. These results offer a new perspective on the expansion of the number of neurons from retina to V1 and provide a theoretical model of incorporating useful redundancy into efficient neural representations.
The Cerebellum Chip: an Analog VLSI Implementation of a Cerebellar Model of Classical Conditioning
Hofstoetter, Constanze, Gil, Manuel, Eng, Kynan, Indiveri, Giacomo, Mintz, Matti, Kramer, Jörg, Verschure, Paul F.
We present a biophysically constrained cerebellar model of classical conditioning, implemented using a neuromorphic analog VLSI (aVLSI) chip. Like its biological counterpart, our cerebellar model is able to control adaptive behavior by predicting the precise timing of events. Here we describe the functionality of the chip and present its learning performance, as evaluated in simulated conditioning experiments at the circuit level and in behavioral experiments using a mobile robot. We show that this aVLSI model supports the acquisition and extinction of adaptively timed conditioned responses under real-world conditions with ultra-low power consumption.
Real-Time Pitch Determination of One or More Voices by Nonnegative Matrix Factorization
An auditory "scene", composed of overlapping acoustic sources, can be viewed as a complex object whose constituent parts are the individual sources. Pitch is known to be an important cue for auditory scene analysis. In this paper, with the goal of building agents that operate in human environments, we describe a real-time system to identify the presence of one or more voices and compute their pitch. The signal processing in the front end is based on instantaneous frequency estimation, a method for tracking the partials of voiced speech, while the pattern-matching in the back end is based on nonnegative matrix factorization, an unsupervised algorithm for learning the parts of complex objects. While supporting a framework to analyze complicated auditory scenes, our system maintains real-time operability and state-of-the-art performance in clean speech.
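As a rough illustration of the nonnegative matrix factorization step mentioned above, the sketch below estimates nonnegative activations h for one observation v against a fixed nonnegative basis W, using the standard multiplicative update for squared error. It is not the system described in the abstract: the function name, the idea that each column of W is a pitch template, and the toy numbers are all assumptions made for illustration.

```python
def nmf_activations(v, W, iterations=200, eps=1e-12):
    """Estimate nonnegative h with W h ~= v by multiplicative updates.

    v: list of m nonnegative observations (hypothetical spectrum).
    W: m x k nonnegative basis, one hypothetical pitch template per column.
    """
    m, k = len(W), len(W[0])
    h = [1.0] * k  # positive start keeps every update nonnegative
    # Precompute W^T v and W^T W, which do not change while W is fixed.
    wtv = [sum(W[i][a] * v[i] for i in range(m)) for a in range(k)]
    wtw = [[sum(W[i][a] * W[i][b] for i in range(m)) for b in range(k)]
           for a in range(k)]
    for _ in range(iterations):
        denom = [sum(wtw[a][b] * h[b] for b in range(k)) for a in range(k)]
        # Rescale each activation by the ratio (W^T v) / (W^T W h).
        h = [h[a] * wtv[a] / (denom[a] + eps) for a in range(k)]
    return h


# Toy example: v is an exact mixture of the two templates.
templates = [[1, 0], [1, 1], [0, 1]]
activations = nmf_activations([2.0, 2.5, 0.5], templates)
```

Because the update only multiplies by nonnegative ratios, the activations can never become negative, which is what lets each one be read as "how much of this part is present".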
Adaptive Discriminative Generative Model and Its Applications
Lin, Ruei-sung, Ross, David A., Lim, Jongwoo, Yang, Ming-Hsuan
This paper presents an adaptive discriminative generative model that generalizes the conventional Fisher Linear Discriminant algorithm and renders a proper probabilistic interpretation. Within the context of object tracking, we aim to find a discriminative generative model that best separates the target from the background. We present a computationally efficient algorithm to constantly update this discriminative model as time progresses. While most tracking algorithms operate on the premise that the object appearance or ambient lighting condition does not significantly change as time progresses, our method adapts a discriminative generative model to reflect appearance variation of the target and background, thereby facilitating the tracking task in ever-changing environments. Numerous experiments show that our method is able to learn a discriminative generative model for tracking target objects undergoing large pose and lighting changes.
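For orientation, the conventional two-class Fisher Linear Discriminant that the abstract takes as its starting point can be sketched in two dimensions as below. This shows only the textbook baseline, w proportional to the inverse within-class scatter times the difference of class means. It does not implement the adaptive model of the paper, and the function name and sample points are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Return the 2-D Fisher direction w = Sw^{-1} (mean_a - mean_b)."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    # Within-class scatter is the sum of the two per-class scatters.
    sxx, sxy, syy = (a + b for a, b in zip(scatter(class_a, ma),
                                           scatter(class_b, mb)))
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form inverse of a 2x2 matrix applied to the mean difference.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


# Hypothetical "target" and "background" feature points.
target = [(2, 0), (3, 1), (4, 0), (3, -1)]
background = [(-2, 0), (-3, 1), (-4, 0), (-3, -1)]
w = fisher_direction(target, background)
```

Projecting each point onto w gives a single score whose sign separates the two classes in this toy example.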
The power of feature clustering: An application to object detection
We give a fast rejection scheme that is based on image segments and demonstrate it on the canonical example of face detection. However, instead of focusing on the detection step we focus on the rejection step and show that our method is simple and fast to learn, thus making it an excellent pre-processing step to accelerate standard machine learning classifiers, such as neural networks, Bayes classifiers or SVMs. We decompose a collection of face images into regions of pixels with similar behavior over the image set. The relationships between the mean and variance of image segments are used to form a cascade of rejectors that can reject over 99.8% of image patches, so that only a small fraction of the image patches must be passed to a full-scale classifier. Moreover, the training time for our method is much less than an hour on a standard PC.
Large-Scale Prediction of Disulphide Bond Connectivity
Cheng, Jianlin, Vullo, Alessandro, Baldi, Pierre F.
The formation of disulphide bridges among cysteines is an important feature of protein structures. Here we develop new methods for the prediction of disulphide bond connectivity. We first build a large curated data set of proteins containing disulphide bridges and then use 2-Dimensional Recursive Neural Networks to predict bonding probabilities between cysteine pairs. These probabilities in turn lead to a weighted graph matching problem that can be addressed efficiently. We show how the method consistently achieves better results than previous approaches on the same validation data. In addition, the method can easily cope with chains with arbitrary numbers of bonded cysteines. Therefore, it overcomes one of the major limitations of previous approaches restricting predictions to chains containing no more than 10 oxidized cysteines. The method can be applied both to situations where the bonded state of each cysteine is known or unknown, in which case bonded state can be predicted with 85% precision and 90% recall. The method also yields an estimate for the total number of disulphide bridges in each chain.
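The step from pairwise bonding probabilities to a connectivity pattern can be pictured as picking the perfect matching with the largest total weight. The sketch below does this by exhaustive search with memoization, which is fine for a handful of cysteines but is not the efficient matching algorithm the abstract refers to. Summing probabilities as the objective, the function name, and the example probabilities are assumptions for illustration only.

```python
from functools import lru_cache


def best_pairing(prob):
    """Return (score, pairs) for the maximum-weight perfect matching.

    prob[i][j] is a hypothetical predicted bonding probability between
    cysteines i and j (symmetric; the diagonal is ignored).
    """
    n = len(prob)
    if n % 2:
        raise ValueError("a perfect matching needs an even number of cysteines")

    @lru_cache(maxsize=None)
    def solve(mask):
        # mask has bit i set while cysteine i is still unpaired.
        if mask == 0:
            return 0.0, ()
        i = (mask & -mask).bit_length() - 1  # lowest unpaired index
        rest = mask & ~(1 << i)
        best = (-1.0, ())
        for j in range(i + 1, n):
            if rest >> j & 1:
                score, pairs = solve(rest & ~(1 << j))
                candidate = score + prob[i][j]
                if candidate > best[0]:
                    best = (candidate, ((i, j),) + pairs)
        return best

    return solve((1 << n) - 1)


# Four cysteines with made-up pairwise probabilities.
p = [
    [0.0, 0.9, 0.6, 0.2],
    [0.9, 0.0, 0.3, 0.1],
    [0.6, 0.3, 0.0, 0.8],
    [0.2, 0.1, 0.8, 0.0],
]
score, bridges = best_pairing(p)
```

For longer chains a polynomial-time maximum-weight matching routine would replace the exhaustive search; the input and output stay the same.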
Face Detection --- Efficient and Rank Deficient
Kienzle, Wolf, Franz, Matthias O., Schölkopf, Bernhard, Bakir, Gökhan H.
This paper proposes a method for computing fast approximations to support vector decision functions in the field of object detection. In the present approach we are building on an existing algorithm where the set of support vectors is replaced by a smaller, so-called reduced set of synthesized input space points. In contrast to the existing method that finds the reduced set via unconstrained optimization, we impose a structural constraint on the synthetic points such that the resulting approximations can be evaluated via separable filters. For applications that require scanning large images, this decreases the computational complexity by a significant amount. Experimental results show that in face detection, rank deficient approximations are 4 to 6 times faster than unconstrained reduced set systems.
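The computational benefit of separable filters comes from a simple identity: when a 2-D kernel is the outer product of a column vector and a row vector (rank one), sliding it over an image gives the same result as one horizontal 1-D pass followed by one vertical 1-D pass. The sketch below checks that identity on a small integer image. It illustrates only the filtering identity, not the reduced set construction in the paper, and all names and values are hypothetical.

```python
def correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation of nested-list image and kernel."""
    kh, kw = len(kernel), len(kernel[0])
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(width - kw + 1)
        ]
        for r in range(height - kh + 1)
    ]


def correlate_separable(image, col, row):
    """Apply the rank-one kernel outer(col, row) as two 1-D passes."""
    horizontal = correlate2d(image, [row])               # 1 x kw pass
    return correlate2d(horizontal, [[v] for v in col])   # kh x 1 pass


# Small integer example so the two results can be compared exactly.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
col, row = [1, 2, 1], [1, 0, -1]
full_kernel = [[c * r for r in row] for c in col]

direct = correlate2d(image, full_kernel)
two_pass = correlate_separable(image, col, row)
```

For a kernel of height kh and width kw, the direct form needs kh * kw multiplications per output pixel, while the two-pass form needs roughly kh + kw, which is where the savings on large images come from.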
Nonlinear Blind Source Separation by Integrating Independent Component Analysis and Slow Feature Analysis
Blaschke, Tobias, Wiskott, Laurenz
In contrast to the equivalence of linear blind source separation and linear independent component analysis it is not possible to recover the original source signal from some unknown nonlinear transformations of the sources using only the independence assumption. Integrating the objectives of statistical independence and temporal slowness removes this indeterminacy leading to a new method for nonlinear blind source separation. The principle of temporal slowness is adopted from slow feature analysis, an unsupervised method to extract slowly varying features from a given observed vectorial signal. The performance of the algorithm is demonstrated on nonlinearly mixed speech data.
Joint Tracking of Pose, Expression, and Texture using Conditionally Gaussian Filters
Marks, Tim K., Roddey, J. C., Movellan, Javier R., Hershey, John R.
We present a generative model and stochastic filtering algorithm for simultaneous tracking of 3D position and orientation, nonrigid motion, object texture, and background texture using a single camera. We show that the solution to this problem is formally equivalent to stochastic filtering of conditionally Gaussian processes, a problem for which well known approaches exist [3, 8]. We propose an approach based on Monte Carlo sampling of the nonlinear component of the process (object motion) and exact filtering of the object and background textures given the sampled motion. The smoothness of image sequences in time and space is exploited by using Laplace's method to generate proposal distributions for importance sampling [7]. The resulting inference algorithm encompasses both optic flow and template-based tracking as special cases, and elucidates the conditions under which these methods are optimal. We demonstrate an application of the system to 3D nonrigid face tracking.