Goto

Collaborating Authors

 Technology


Speech Modelling Using Subspace and EM Techniques

Neural Information Processing Systems

The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm. Standard initialisation schemes can lead to poor formant trajectories. But these trajectories however are important for vowel intelligibility. The aim of this paper is to investigate the suitability of subspace identification methods to initialise EM. The paper compares the subspace state space system identification (4SID) method with the EM algorithm. The 4SID and EM methods are similar in that they both estimate a state sequence (but using Kalman ters fil and Kalman smoothers respectively), and then estimate parameters (but using least-squares and maximum likelihood respectively).


Online Independent Component Analysis with Local Learning Rate Adaptation

Neural Information Processing Systems

Stochastic meta-descent (SMD) is a new technique for online adaptation of local learning rates in arbitrary twice-differentiable systems. Like matrix momentum it uses full second-order information while retaining O(n) computational complexity by exploiting the efficient computation of Hessian-vector products. Here we apply SMD to independent component analysis, and employ the resulting algorithm for the blind separation of time-varying mixtures. By matching individual learning rates to the rate of change in each source signal's mixture coefficients, our technique is capable of simultaneously tracking sources that move at very different, a priori unknown speeds.


Constrained Hidden Markov Models

Neural Information Processing Systems

By thinking of each state in a hidden Markov model as corresponding to some spatial region of a fictitious topology space it is possible to naturally define neighbouring states as those which are connected in that space. The transition matrix can then be constrained to allow transitions only between neighbours; this means that all valid state sequences correspond to connected paths in the topology space. I show how such constrained HMMs can learn to discover underlying structure in complex sequences of high dimensional data, and apply them to the problem of recovering mouth movements from acoustics in continuous speech.


Broadband Direction-Of-Arrival Estimation Based on Second Order Statistics

Neural Information Processing Systems

N wideband sources recorded using N closely spaced receivers can feasibly be separated based only on second order statistics when using a physical model of the mixing process. In this case we show that the parameter estimation problem can be essentially reduced to considering directions of arrival and attenuations of each signal. The paper presents two demixing methods operating in the time and frequency domain and experimentally shows that it is always possible to demix signals arriving at different angles. Moreover, one can use spatial cues to solve the channel selection problem and a post-processing Wiener filter to ameliorate the artifacts caused by demixing.


Neural System Model of Human Sound Localization

Neural Information Processing Systems

This paper examines the role of biological constraints in the human auditory localization process. A psychophysical and neural system modeling approach was undertaken in which performance comparisons between competing models and a human subject explore the relevant biologically plausible "realism constraints". The directional acoustical cues, upon which sound localization is based, were derived from the human subject's head-related transfer functions (HRTFs). Sound stimuli were generated by convolving bandpass noise with the HRTFs and were presented to both the subject and the model. The input stimuli to the model was processed using the Auditory Image Model of cochlear processing. The cochlear data was then analyzed by a time-delay neural network which integrated temporal and spectral information to determine the spatial location of the sound source.


An Oscillatory Correlation Frame work for Computational Auditory Scene Analysis

Neural Information Processing Systems

A neural model is described which uses oscillatory correlation to segregate speech from interfering sound sources. The core of the model is a two-layer neural oscillator network. A sound stream is represented by a synchronized population of oscillators, and different streams are represented by desynchronized oscillator populations. The model has been evaluated using a corpus of speech mixed with interfering sounds, and produces an improvement in signal-to-noise ratio for every mixture. 1 Introduction Speech is seldom heard in isolation: usually, it is mixed with other environmental sounds. Hence, the auditory system must parse the acoustic mixture reaching the ears in order to retrieve a description of each sound source, a process termed auditory scene analysis (ASA) [2]. Conceptually, ASA may be regarded as a two-stage process.


Bifurcation Analysis of a Silicon Neuron

Neural Information Processing Systems

We have developed a VLSI silicon neuron and a corresponding mathematical model that is a two state-variable system. We describe the circuit implementation and compare the behaviors observed in the silicon neuron and the mathematical model. We also perform bifurcation analysis of the mathematical model by varying the externally applied current and show that the behaviors exhibited by the silicon neuron under corresponding conditions are in good agreement to those predicted by the bifurcation analysis.


A Neuromorphic VLSI System for Modeling the Neural Control of Axial Locomotion

Neural Information Processing Systems

We have developed and tested an analog/digital VLSI system that models the coordination of biological segmental oscillators underlying axial locomotion in animals such as leeches and lampreys. In its current form the system consists of a chain of twelve pattern generating circuits that are capable of arbitrary contralateral inhibitory synaptic coupling. Each pattern generating circuit is implemented with two independent silicon Morris-Lecar neurons with a total of 32 programmable (floating-gate based) inhibitory synapses, and an asynchronous address-event interconnection element that provides synaptic connectivity and implements axonal delay. We describe and analyze the data from a set of experiments exploring the system behavior in terms of synaptic coupling.


A Winner-Take-All Circuit with Controllable Soft Max Property

Neural Information Processing Systems

I describe a silicon network consisting of a group of excitatory neurons and a global inhibitory neuron. The output of the inhibitory neuron is normalized with respect to the input strengths.