Sensing and Signal Processing
Stagewise Processing in Error-correcting Codes and Image Restoration
Wong, K. Y. Michael, Nishimori, Hidetoshi
We introduce stagewise processing in error-correcting codes and image restoration, by extracting information from the former stage and using it selectively to improve the performance of the latter. Both mean-field analysis using the cavity method and simulations show that it has the advantage of being robust against uncertainties in hyperparameter estimation. 1 Introduction In error-correcting codes [1] and image restoration [2], the choice of the so-called hyperparameters is an important factor in determining their performance. Hyperparameters refer to the coefficients weighing the biases and variances of the tasks. In error correction, they determine the statistical significance given to the parity-checking terms and the received bits. Similarly in image restoration, they determine the statistical weights given to the prior knowledge and the received data.
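To make the role of the two hyperparameters concrete, the sketch below implements the standard finite-temperature (MPM) restoration of a binary image by Gibbs sampling, with beta weighing the smoothness prior and h weighing fidelity to the received data. It illustrates the generic setup the abstract refers to, not the authors' stagewise scheme; the function name and parameter values are illustrative.

```python
import numpy as np

def mpm_restore(tau, beta=0.9, h=0.6, sweeps=50, rng=None):
    """Minimal MPM restoration of a binary (+/-1) image by Gibbs sampling.

    beta weighs the smoothness prior (neighbouring pixels should agree),
    h weighs fidelity to the received data tau -- these are the
    hyperparameters whose estimation the abstract discusses.
    """
    rng = np.random.default_rng(rng)
    sigma = tau.copy()
    mag = np.zeros(tau.shape, dtype=float)       # running sum of sampled spins
    H, W = tau.shape
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                nb = (sigma[(i - 1) % H, j] + sigma[(i + 1) % H, j]
                      + sigma[i, (j - 1) % W] + sigma[i, (j + 1) % W])
                field = beta * nb + h * tau[i, j]          # prior term + data term
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
                sigma[i, j] = 1 if rng.random() < p_plus else -1
        mag += sigma
    return np.where(mag >= 0, 1, -1)             # MPM estimate: sign of posterior mean
```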
Partially Observable SDE Models for Image Sequence Recognition Tasks
Movellan, Javier R., Mineiro, Paul, Williams, Ruth J.
This paper explores a framework for recognition of image sequences using partially observable stochastic differential equation (SDE) models. Monte-Carlo importance sampling techniques are used for efficient estimation of sequence likelihoods and sequence likelihood gradients. Once the network dynamics are learned, we apply the SDE models to sequence recognition tasks in a manner similar to the way Hidden Markov models (HMMs) are commonly applied. The potential advantage of SDEs over HMMs is the use of continuous state dynamics. We present encouraging results for a video sequence recognition task in which SDE models provided excellent performance when compared to hidden Markov models. 1 Introduction This paper explores a framework for recognition of image sequences using partially observable stochastic differential equations (SDEs). In particular we use SDE models of low-power nonlinear RC circuits with a significant thermal noise component. We call them diffusion networks. A diffusion network consists of a set of n nodes coupled via a vector of adaptive impedance parameters, which are tuned to optimize the network's behavior.
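As a rough illustration of how a sequence likelihood can be estimated by Monte-Carlo sampling in a partially observable SDE, the sketch below assumes a linear drift matrix A and Gaussian observation noise, discretises the dynamics with Euler-Maruyama, and samples hidden paths from the prior dynamics (so the importance weights reduce to observation likelihoods). It is a simplified stand-in for the diffusion-network model and sampler of the paper, not the authors' implementation.

```python
import numpy as np

def sequence_log_likelihood(y, A, obs_std=0.1, diff_std=0.2, dt=0.01,
                            n_paths=500, rng=None):
    """Monte-Carlo estimate of log p(y) under a partially observable SDE.

    Hidden state: dx = A x dt + diff_std dW   (Euler-Maruyama discretisation)
    Observation:  y_t = x_t + Gaussian noise with scale obs_std.
    """
    rng = np.random.default_rng(rng)
    T, n = y.shape
    x = np.zeros((n_paths, n))
    log_w = np.zeros(n_paths)
    for t in range(T):
        # Euler-Maruyama step of the (toy) diffusion-network dynamics
        x = x + dt * x @ A.T + diff_std * np.sqrt(dt) * rng.standard_normal(x.shape)
        # accumulate the observation log-likelihood along each sampled path
        log_w += (-0.5 * np.sum(((y[t] - x) / obs_std) ** 2, axis=1)
                  - n * np.log(obs_std * np.sqrt(2 * np.pi)))
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))   # log-mean-exp of path weights
```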
From Mixtures of Mixtures to Adaptive Transform Coding
Archer, Cynthia, Leen, Todd K.
We establish a principled framework for adaptive transform coding. Transform coders are often constructed by concatenating an ad hoc choice of transform with suboptimal bit allocation and quantizer design. Instead, we start from a probabilistic latent variable model in the form of a mixture of constrained Gaussian mixtures. From this model we derive a transform coding algorithm, which is a constrained version of the generalized Lloyd algorithm for vector quantizer design. A byproduct of our derivation is the introduction of a new transform basis, which unlike other transforms (PCA, DCT, etc.) is explicitly optimized for coding.
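For contrast with the model-based approach of the paper, here is a minimal sketch of the conventional pipeline it improves on: an ad hoc (PCA) transform followed by independent scalar Lloyd quantizers. Function names and settings are illustrative, not the authors' algorithm.

```python
import numpy as np

def lloyd_quantizer(x, levels=4, iters=30):
    """1-D Lloyd(-Max) quantizer: alternate nearest-codeword assignment and
    centroid update, the scalar analogue of the generalized Lloyd step."""
    codebook = np.quantile(x, np.linspace(0.1, 0.9, levels))
    for _ in range(iters):
        idx = np.argmin(np.abs(x[:, None] - codebook[None, :]), axis=1)
        for k in range(levels):
            if np.any(idx == k):
                codebook[k] = x[idx == k].mean()
    return codebook

def transform_code(blocks, levels=4):
    """Toy transform coder: PCA transform of image blocks followed by
    independent scalar quantization of each transform coefficient."""
    mean = blocks.mean(axis=0)
    centered = blocks - mean
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)   # PCA basis
    coeffs = centered @ Vt.T
    codebooks = [lloyd_quantizer(coeffs[:, j], levels) for j in range(coeffs.shape[1])]
    quantized = np.stack([cb[np.argmin(np.abs(c[:, None] - cb[None, :]), axis=1)]
                          for c, cb in zip(coeffs.T, codebooks)], axis=1)
    return quantized @ Vt + mean                               # reconstructed blocks
```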
A Comparison of Image Processing Techniques for Visual Speech Recognition Applications
Gray, Michael S., Sejnowski, Terrence J., Movellan, Javier R.
These methods are compared on their performance on a visual speech recognition task. While the representations developed are specific to visual speech recognition, the methods themselves are general purpose and applicable to other tasks. Our focus is on low-level data-driven methods based on the statistical properties of relatively untouched images, as opposed to approaches that work with contours or highly processed versions of the image. Padgett [8] and Bartlett [1] systematically studied statistical methods for developing representations on expression recognition tasks. They found that local wavelet-like representations consistently outperformed global representations, like eigenfaces. In this paper we also compare local versus global representations.
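A loose illustration of the global-versus-local distinction, assuming simple PCA features in both cases (not the specific representations compared in the paper):

```python
import numpy as np

def global_pca_features(images, k=20):
    """Global ('eigenface'-style) representation: PCA on whole frames."""
    X = images.reshape(len(images), -1).astype(float)
    X -= X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T                          # k global coefficients per frame

def local_pca_features(images, patch=8, k=8):
    """Local representation: PCA on small patches, pooled over each frame."""
    patches = []
    for img in images:
        for i in range(0, img.shape[0] - patch + 1, patch):
            for j in range(0, img.shape[1] - patch + 1, patch):
                patches.append(img[i:i + patch, j:j + patch].ravel())
    P = np.asarray(patches, dtype=float)
    P -= P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P, full_matrices=False)
    coeffs = (P @ Vt[:k].T).reshape(len(images), -1, k)
    return coeffs.mean(axis=1)                   # pool local coefficients per frame
```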
FaceSync: A Linear Operator for Measuring Synchronization of Video Facial Images and Audio Tracks
Slaney, Malcolm, Covell, Michele
FaceSync is an optimal linear algorithm that finds the degree of synchronization between the audio and image recordings of a human speaker. Using canonical correlation, it finds the best direction to combine all the audio and image data, projecting them onto a single axis. FaceSync uses Pearson's correlation to measure the degree of synchronization between the audio and image data. We derive the optimal linear transform to combine the audio and visual information and describe an implementation that avoids the numerical problems caused by computing the correlation matrices.
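A minimal sketch of the canonical-correlation computation the abstract describes, assuming matched matrices of per-frame audio and video features. The ridge term is a generic numerical safeguard for the covariance matrices, not necessarily the authors' remedy.

```python
import numpy as np

def cca_sync(audio, video, reg=1e-3):
    """Leading canonical correlation between audio and video feature streams,
    used here as a synchrony score. `audio` and `video` are (frames x dims)."""
    A = audio - audio.mean(axis=0)
    V = video - video.mean(axis=0)
    n = len(A)
    Caa = A.T @ A / n + reg * np.eye(A.shape[1])   # regularized auto-covariances
    Cvv = V.T @ V / n + reg * np.eye(V.shape[1])
    Cav = A.T @ V / n                              # cross-covariance
    M = np.linalg.solve(Caa, Cav) @ np.linalg.solve(Cvv, Cav.T)
    eigvals = np.linalg.eigvals(M).real
    return float(np.sqrt(eigvals.clip(min=0.0).max()))  # top canonical correlation
```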
Learning Sparse Image Codes using a Wavelet Pyramid Architecture
Olshausen, Bruno A., Sallee, Phil, Lewicki, Michael S.
We show how a wavelet basis may be adapted to best represent natural images in terms of sparse coefficients. The wavelet basis, which may be either complete or overcomplete, is specified by a small number of spatial functions which are repeated across space and combined in a recursive fashion so as to be self-similar across scale. These functions are adapted to minimize the estimated code length under a model that assumes images are composed of a linear superposition of sparse, independent components. When adapted to natural images, the wavelet bases take on different orientations and they evenly tile the orientation domain, in stark contrast to the standard, non-oriented wavelet bases used in image compression. When the basis set is allowed to be overcomplete, it also yields higher coding efficiency than standard wavelet bases. 1 Introduction The general problem we address here is that of learning efficient codes for representing natural images.
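A minimal sketch of sparse coding with an adaptive basis in the Olshausen-Field style: infer sparse coefficients with soft-thresholded gradient (ISTA-like) updates, then nudge the basis toward the reconstruction residual. It omits the pyramid/self-similar wavelet structure that is the paper's contribution, and the learning rates and penalty are illustrative.

```python
import numpy as np

def sparse_code(images, n_basis=64, lam=0.1, lr=0.01, epochs=20, rng=None):
    """Adapt a linear basis Phi so that images are explained by sparse codes a."""
    rng = np.random.default_rng(rng)
    X = images.reshape(len(images), -1).astype(float)
    Phi = rng.standard_normal((X.shape[1], n_basis))
    Phi /= np.linalg.norm(Phi, axis=0)
    for _ in range(epochs):
        a = np.zeros((len(X), n_basis))
        for _ in range(50):                              # coefficient inference
            z = a - lr * (a @ Phi.T - X) @ Phi           # gradient step on the fit
            a = np.sign(z) * np.maximum(np.abs(z) - lr * lam, 0.0)  # sparsity shrinkage
        Phi += lr * (X - a @ Phi.T).T @ a / len(X)       # basis (dictionary) update
        Phi /= np.linalg.norm(Phi, axis=0)
    return Phi, a
```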
Higher-Order Statistical Properties Arising from the Non-Stationarity of Natural Signals
Parra, Lucas C., Spence, Clay, Sajda, Paul
The first is that a variety of natural signals can be related through a common model of spherically invariant random processes, which have the attractive property that the joint densities can be constructed from the one-dimensional marginal. The second is that in some cases the non-stationarity assumption and only second order methods can be explicitly exploited to find a linear basis that is equivalent to independent components obtained with higher-order methods. This is demonstrated on spectro-temporal components of speech. 1 Introduction Recently, considerable attention has been paid to understanding and modeling the non-Gaussian or "higher-order" properties of natural signals, particularly images. Several non-Gaussian properties have been identified and studied. For example, marginal densities of features have been shown to have high kurtosis or "heavy tails", indicating a non-Gaussian, sparse representation. Another example is the "bowtie" shape of conditional distributions of neighboring features, indicating dependence of variances [11]. These non-Gaussian properties have motivated a number of image and signal processing algorithms that attempt to exploit higher-order statistics of the signals, e.g., for blind source separation. In this paper we show that these previously observed higher-order phenomena are ubiquitous and can be accounted for by a model which simply varies the scale of an otherwise stationary Gaussian process. This enables us to relate a variety of natural signals to one another and to spherically invariant random processes, which are well-known in the signal processing literature [6, 3].
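A small numerical illustration of the central point, assuming a log-normal scale process: modulating the variance of an otherwise stationary Gaussian already produces the heavy-tailed, high-kurtosis marginals described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stationary Gaussian signal: excess kurtosis is approximately zero.
stationary = rng.standard_normal(100_000)

# Same Gaussian, but with a slowly varying scale (non-stationary variance):
# the process is Gaussian within each block, yet the marginal is heavy-tailed.
scales = np.repeat(rng.lognormal(mean=0.0, sigma=0.7, size=1_000), 100)
nonstationary = scales * rng.standard_normal(100_000)

def excess_kurtosis(x):
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2)**2 - 3.0

print("stationary    :", round(excess_kurtosis(stationary), 2))     # close to 0
print("non-stationary:", round(excess_kurtosis(nonstationary), 2))  # clearly > 0
```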