
Noise Suppression Based on Neurophysiologically-motivated SNR Estimation for Robust Speech Recognition

Neural Information Processing Systems

For SNR estimation, the input signal is transformed into so-called Amplitude Modulation Spectrograms (AMS), which represent both spectral and temporal characteristics of the respective analysis frame, and which imitate the representation of modulation frequencies in higher stages of the mammalian auditory system. A neural network is used to analyse AMS patterns generated from noisy speech and to estimate the local SNR. Noise suppression is achieved by attenuating frequency channels according to their SNR. The noise suppression algorithm is evaluated in speaker-independent digit recognition experiments and compared to noise suppression by Spectral Subtraction.

1 Introduction

One of the major problems in automatic speech recognition (ASR) systems is their lack of robustness in noise, which severely degrades their usefulness in many practical applications. Several proposals have been made to increase the robustness of ASR systems, e.g. by model compensation or more noise-robust feature extraction [1, 2]. Another method to increase the robustness of ASR systems is to suppress the background noise before feature extraction. Classical approaches for single-channel noise suppression are Spectral Subtraction [3] and related schemes, e.g.
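
The core operation, attenuating each frequency channel according to its estimated local SNR, can be sketched as follows. This is a minimal sketch assuming a Wiener-style gain rule and SNR estimates given as input; the paper's actual estimator is a neural network operating on AMS patterns, which is not reproduced here.

```python
import numpy as np

def suppress_noise(spectrogram, snr_db):
    """Attenuate each time-frequency cell according to its estimated local SNR.

    spectrogram : (freq, time) magnitude spectrogram of the noisy signal
    snr_db      : (freq, time) local SNR estimates in dB (assumed given here;
                  the paper derives them from AMS patterns via a neural network)
    """
    snr_lin = 10.0 ** (snr_db / 10.0)
    # Wiener-style gain: high-SNR channels pass nearly unchanged,
    # low-SNR channels are strongly attenuated.
    gain = snr_lin / (1.0 + snr_lin)
    return gain * spectrogram
```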


On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems

Neural Information Processing Systems

Our algorithm exploits the special structure of the sum of squared error measure in Equation (1); hence, other objective functions are outside the scope of this paper. The gradient vector and Hessian matrix are given by $g \equiv g(\theta) = J^T r$ and $H \equiv H(\theta) = J^T J + S$, where $J$ is the $m \times n$ Jacobian matrix of $r$, and $S$ denotes the matrix of second-derivative terms. If $S$ is simply omitted based on the "small residual" assumption, then the Hessian matrix reduces to the Gauss-Newton model Hessian, i.e., $J^T J$. Furthermore, a family of quasi-Newton methods can be applied to approximate the term $S$ alone, leading to the augmented Gauss-Newton model Hessian (see, for example, Mizutani [2] and references therein).
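
As a reference point, a plain Gauss-Newton step under the "small residual" assumption can be sketched as below; this deliberately omits the Krylov-dogleg trust-region machinery that is the paper's actual contribution, and the small damping term is an added numerical-stability assumption.

```python
import numpy as np

def gauss_newton_step(J, r, damping=1e-8):
    """One Gauss-Newton step for E(theta) = 0.5 * r(theta)^T r(theta).

    J : (m, n) Jacobian of the residual vector r at the current theta
    r : (m,) residual vector

    Returns the update delta solving (J^T J) delta = -J^T r, where the
    second-derivative term S has been dropped ("small residual" assumption).
    """
    g = J.T @ r          # gradient g = J^T r
    H_gn = J.T @ J       # Gauss-Newton model Hessian
    n = H_gn.shape[0]
    return np.linalg.solve(H_gn + damping * np.eye(n), -g)
```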


A New Model of Spatial Representation in Multimodal Brain Areas

Neural Information Processing Systems

Most models of spatial representations in the cortex assume cells with limited receptive fields that are defined in a particular egocentric frame of reference. However, cells outside of primary sensory cortex are either gain modulated by postural input or partially shifting. We show that solving classical spatial tasks, like sensory prediction, multi-sensory integration, sensory-motor transformation and motor control, requires more complicated intermediate representations that are not invariant in one frame of reference. We present an iterative basis function map that performs these spatial tasks optimally with gain-modulated and partially shifting units, and we test it against neurophysiological and neuropsychological data. In order to perform an action directed toward an object, it is necessary to have a representation of its spatial location.
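
For intuition, a gain-modulated unit is commonly modeled as a visual tuning curve whose amplitude, but not preferred location, is scaled by postural input such as eye position. The Gaussian-times-sigmoid form below is a standard textbook parameterization, assumed here for illustration rather than taken from the paper.

```python
import numpy as np

def basis_unit_response(stim_pos, eye_pos, pref_stim, pref_eye,
                        sigma=10.0, slope=0.1):
    """Response of one gain-modulated basis-function unit.

    Visual tuning: Gaussian in retinal stimulus position.
    Gain field:    sigmoid in eye position, which scales (rather than
                   shifts) the visual tuning curve.
    """
    visual = np.exp(-(stim_pos - pref_stim) ** 2 / (2 * sigma ** 2))
    gain = 1.0 / (1.0 + np.exp(-slope * (eye_pos - pref_eye)))
    return visual * gain
```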


The Interplay of Symbolic and Subsymbolic Processes in Anagram Problem Solving

Neural Information Processing Systems

Although connectionist models have provided insights into the nature of perception and motor control, connectionist accounts of higher cognition seldom go beyond an implementation of traditional symbol-processing theories. We describe a connectionist constraint satisfaction model of how people solve anagram problems. The model exploits statistics of English orthography, but also addresses the interplay of subsymbolic and symbolic computation by a mechanism that extracts approximate symbolic representations (partial orderings of letters) from subsymbolic structures and injects the extracted representation back into the model to assist in the solution of the anagram. We show the computational benefit of this extraction-injection process and discuss its relationship to conscious mental processes and working memory. We also account for experimental data concerning the difficulty of anagram solution based on the orthographic structure of the anagram string and the target word.
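
To make "statistics of English orthography" concrete, one simple constraint on candidate letter orderings is bigram plausibility; the toy scorer below is an illustrative assumption and stands in for, rather than reproduces, the model's connectionist constraint-satisfaction dynamics.

```python
from collections import Counter
from itertools import permutations

def bigram_counts(corpus_words):
    """Adjacent-letter-pair frequencies over a word list
    (a crude stand-in for English orthographic statistics)."""
    counts = Counter()
    for w in corpus_words:
        for a, b in zip(w, w[1:]):
            counts[a + b] += 1
    return counts

def rank_orderings(letters, counts, k=3):
    """Score each full ordering of the anagram's letters by summed
    bigram frequency and return the k most word-like candidates."""
    cands = {''.join(p) for p in permutations(letters)}
    return sorted(cands,
                  key=lambda w: sum(counts[a + b] for a, b in zip(w, w[1:])),
                  reverse=True)[:k]
```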


Discovering Hidden Variables: A Structure-Based Approach

Neural Information Processing Systems

A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. As such, they induce seemingly complex dependencies among the latter. In recent years, much attention has been devoted to the development of algorithms for learning parameters, and in some cases structure, in the presence of hidden variables. In this paper, we address the related problem of detecting hidden variables that interact with the observed variables.
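
The abstract does not spell out the detection criterion, but the structural signature such an approach can exploit is that marginalizing out a hidden variable tends to leave its observed neighbors densely interconnected (a "semi-clique"). The brute-force scan below is a hedged sketch of that idea under a simple adjacency-set representation, not the paper's actual algorithm, which must be far more efficient than this exponential enumeration.

```python
from itertools import combinations

def is_semi_clique(nodes, adj):
    """True if every node in the set is adjacent to at least half of the
    other nodes (one plausible notion of 'densely interconnected')."""
    k = len(nodes)
    return all(sum(v in adj[u] for v in nodes if v != u) >= (k - 1) / 2
               for u in nodes)

def candidate_hidden_sites(adj, min_size=4, max_size=6):
    """Exhaustively scan a learned structure (adj: node -> set of neighbors)
    for semi-cliques; each hit is a candidate site where a hidden variable
    may have been marginalized out. Exponential cost: illustration only."""
    found = []
    for k in range(min_size, max_size + 1):
        for nodes in combinations(adj, k):
            if is_semi_clique(nodes, adj):
                found.append(set(nodes))
    return found
```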


Dendritic Compartmentalization Could Underlie Competition and Attentional Biasing of Simultaneous Visual Stimuli

Neural Information Processing Systems

Neurons in area V4 have relatively large receptive fields (RFs), so multiple visual features are simultaneously "seen" by these cells. Recordings from single V4 neurons suggest that simultaneously presented stimuli compete to set the output firing rate, and that attention acts to isolate individual features by biasing the competition in favor of the attended object. We propose that both stimulus competition and attentional biasing arise from the spatial segregation of afferent synapses onto different regions of the excitable dendritic tree of V4 neurons. The pattern of feedforward, stimulus-driven inputs follows from a Hebbian rule: excitatory afferents with similar RFs tend to group together on the dendritic tree, avoiding randomly located inhibitory inputs with similar RFs. The same principle guides the formation of inputs that mediate attentional modulation.


Universality and Individuality in a Neural Code

Neural Information Processing Systems

This basic question in the theory of knowledge seems to be beyond the scope of experimental investigation. An accessible version of this question is whether different observers of the same sense data have the same neural representation of these data: how much of the neural code is universal, and how much is individual? Differences in the neural codes of different individuals may arise from various sources: First, different individuals may use different 'vocabularies' of coding symbols. Second, they may use the same symbols to encode different stimulus features.


Natural Sound Statistics and Divisive Normalization in the Auditory System

Neural Information Processing Systems

We explore the statistical properties of natural sound stimuli preprocessed with a bank of linear filters. The responses of such filters exhibit a striking form of statistical dependency, in which the response variance of each filter grows with the response amplitude of filters tuned for nearby frequencies. These dependencies may be substantially reduced using an operation known as divisive normalization, in which the response of each filter is divided by a weighted sum of the rectified responses of other filters. The weights may be chosen to maximize the independence of the normalized responses for an ensemble of natural sounds. We demonstrate that the resulting model accounts for nonlinearities in the response characteristics of the auditory nerve, by comparing model simulations to electrophysiological recordings. In previous work (NIPS, 1998) we demonstrated that an analogous model derived from the statistics of natural images accounts for nonlinear properties of neurons in primary visual cortex.
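
The operation described here has a standard form: each filter's rectified (squared) response is divided by a weighted sum of the rectified responses of the other filters plus a semisaturation constant. The sketch below uses that generic form, with the weights and constant left as free parameters rather than the values fitted to natural sounds in the paper.

```python
import numpy as np

def divisive_normalization(responses, weights, sigma=1.0):
    """Divisively normalize a bank of linear filter responses.

    responses : (n,) linear filter outputs for one analysis frame
    weights   : (n, n) nonnegative weights; weights[i, j] sets how much
                filter j's energy suppresses filter i
    sigma     : semisaturation constant, preventing division by zero
    """
    energy = responses ** 2                  # rectification (squaring)
    denom = sigma ** 2 + weights @ energy    # weighted pooled energy
    return energy / denom
```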


Reinforcement Learning with Function Approximation Converges to a Region

Neural Information Processing Systems

Many algorithms for approximate reinforcement learning are not known to converge. In fact, there are counterexamples showing that the adjustable weights in some algorithms may oscillate within a region rather than converging to a point. This paper shows that, for two popular algorithms, such oscillation is the worst that can happen: the weights cannot diverge, but instead must converge to a bounded region. The algorithms are SARSA(0) and V(0); the latter algorithm was used in the well-known TD-Gammon program.

1 Introduction

Although there are convergent online algorithms (such as TD(λ) [1]) for learning the parameters of a linear approximation to the value function of a Markov process, no way is known to extend these convergence proofs to the task of online approximation of either the state-value (V*) or the action-value (Q*) function of a general Markov decision process. In fact, there are known counterexamples to many proposed algorithms. For example, fitted value iteration can diverge even for Markov processes [2]; Q-learning with linear function approximators can diverge, even when the states are updated according to a fixed update policy [3]; and SARSA(0) can oscillate between multiple policies with different value functions [4].
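
For reference, one SARSA(0) update with a linear action-value approximator, the setting whose weights are shown to remain in a bounded region, looks like the following; the feature map and step-size choices are generic assumptions, not taken from the paper.

```python
import numpy as np

def sarsa0_update(w, phi_sa, r, phi_sa_next, alpha=0.1, gamma=0.99):
    """One SARSA(0) step with a linear approximator Q(s, a) = w . phi(s, a).

    phi_sa      : feature vector of the current state-action pair
    phi_sa_next : feature vector of the next state and the action actually
                  taken there (on-policy bootstrapping)
    """
    td_error = r + gamma * np.dot(w, phi_sa_next) - np.dot(w, phi_sa)
    return w + alpha * td_error * phi_sa
```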