Dynamic Time-Alignment Kernel in Support Vector Machine

Neural Information Processing Systems

A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of nonlinear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs).
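The idea of embedding time alignment in the kernel can be sketched as follows. The abstract does not give the kernel's formula, so everything here is an assumption for illustration: the frame-level RBF kernel, the three-way dynamic-programming recursion with a doubled diagonal weight, the normalization by total length, and the function names `rbf` and `time_alignment_kernel`.

```python
import math

def rbf(x, y, gamma=1.0):
    """Frame-level RBF kernel between two feature vectors (assumed choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def time_alignment_kernel(xs, ys, gamma=1.0):
    """DTW-style similarity between two sequences of frames.

    Finds the monotone alignment path that maximizes the accumulated
    frame-level kernel value, then normalizes by the total path weight.
    """
    n, m = len(xs), len(ys)
    neg_inf = float("-inf")
    # g[i][j] = best accumulated similarity aligning xs[:i] with ys[:j]
    g = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    g[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = rbf(xs[i - 1], ys[j - 1], gamma)
            g[i][j] = max(
                g[i - 1][j] + k,          # advance in xs only (weight 1)
                g[i - 1][j - 1] + 2 * k,  # advance in both (weight 2)
                g[i][j - 1] + k,          # advance in ys only (weight 1)
            )
    # Every path from (0, 0) to (n, m) has total weight n + m.
    return g[n][m] / (n + m)

if __name__ == "__main__":
    a = [[0.0], [1.0], [2.0]]
    b = [[0.0], [0.0], [1.0], [2.0]]  # same pattern, first frame held longer
    print(time_alignment_kernel(a, b))
```

Because the alignment happens inside the function, a Gram matrix built from it could be handed to any SVM trainer that accepts precomputed kernels. Note that alignment-based similarities of this kind are not guaranteed to be positive semidefinite in general.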


Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Neural Information Processing Systems

Hinton, Gatsby Unit, UCL, London, UK WC1N 3AR, hinton@gatsby.ucl.ac.uk

Abstract: Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMMs. Experiments on speech data show it to be superior to the standard method of discriminatively training HMMs.

1 Introduction: A standard way of performing classification using a generative model is to divide the training cases into their respective classes and then train a set of class conditional models. This unsupervised approach to classification is appealing for two reasons.
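The claim that a logistic unit computes the relative probability of a point under two Gaussians can be checked numerically. The sketch below assumes one-dimensional Gaussians with a shared standard deviation and equal priors (assumptions not stated in the abstract); under them the log-odds are linear in x, with weight (mu1 - mu2) / sigma^2 and bias (mu2^2 - mu1^2) / (2 sigma^2). All function names are illustrative.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def relative_probability(x, mu1, mu2, sigma):
    """Probability that x came from Gaussian 1 rather than Gaussian 2,
    assuming equal priors and a shared standard deviation."""
    p1 = gaussian_pdf(x, mu1, sigma)
    p2 = gaussian_pdf(x, mu2, sigma)
    return p1 / (p1 + p2)

def logistic_unit(x, mu1, mu2, sigma):
    """The same quantity written as a logistic unit sigma(w * x + b)."""
    w = (mu1 - mu2) / sigma ** 2
    b = (mu2 ** 2 - mu1 ** 2) / (2.0 * sigma ** 2)
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, relative_probability(x, 1.0, -1.0, 1.5),
              logistic_unit(x, 1.0, -1.0, 1.5))
```

Replacing `gaussian_pdf` with the likelihood under some other density model gives the kind of substitution the abstract goes on to describe.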


Efficient Resources Allocation for Markov Decision Processes

Neural Information Processing Systems

Assume that we model a complex decision-making problem under uncertainty by a finite MDP. Because of the limited resources used, the parameters of the MDP (transition probabilities and rewards) are uncertain: we assume that we only know a belief state over their possible values. If we select the most probable values of the parameters, we can build an MDP and solve it to deduce the corresponding optimal policy. However, because of the uncertainty over the true parameters, this policy may not be the one that maximizes the expected cumulative rewards of the true (but partially unknown) decision-making problem. We can nevertheless use sampling techniques to estimate the expected loss of using this policy.
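The sampling estimate described above might look like the following sketch. It is not the paper's algorithm: it assumes a Dirichlet belief over each transition distribution (given as pseudo-counts greater than 1), known rewards, a discount factor of 0.9, and value iteration as the solver. All names are hypothetical.

```python
import random

GAMMA = 0.9  # discount factor (assumed)

def q_value(P, R, V, s, a):
    """One-step lookahead value of action a in state s."""
    return R[s][a] + GAMMA * sum(p * v for p, v in zip(P[s][a], V))

def value_iteration(P, R, iters=300):
    """Return (optimal values, greedy policy) for transition model P and rewards R."""
    n = len(P)
    V = [0.0] * n
    for _ in range(iters):
        V = [max(q_value(P, R, V, s, a) for a in range(len(P[s]))) for s in range(n)]
    policy = [max(range(len(P[s])), key=lambda a: q_value(P, R, V, s, a)) for s in range(n)]
    return V, policy

def evaluate_policy(P, R, policy, iters=300):
    """Value of a fixed deterministic policy under model P."""
    V = [0.0] * len(P)
    for _ in range(iters):
        V = [q_value(P, R, V, s, policy[s]) for s in range(len(P))]
    return V

def dirichlet_mode(counts):
    """Most probable distribution under Dirichlet(counts); needs all counts > 1."""
    k, total = len(counts), sum(counts)
    return [(c - 1) / (total - k) for c in counts]

def sample_dirichlet(counts, rng):
    """Draw one distribution from Dirichlet(counts) via normalized gamma draws."""
    draws = [rng.gammavariate(c, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]

def nominal_policy(counts, R):
    """Optimal policy of the MDP built from the most probable parameters."""
    P = [[dirichlet_mode(c) for c in row] for row in counts]
    return value_iteration(P, R)[1]

def expected_loss(counts, R, policy, n_samples=200, seed=0, start=0):
    """Monte Carlo estimate of the loss of `policy` relative to the optimal
    policy of each MDP sampled from the belief, measured at state `start`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        P = [[sample_dirichlet(c, rng) for c in row] for row in counts]
        v_opt, _ = value_iteration(P, R)
        v_pol = evaluate_policy(P, R, policy)
        total += v_opt[start] - v_pol[start]
    return total / n_samples

if __name__ == "__main__":
    # counts[s][a] = Dirichlet pseudo-counts over next states (made-up numbers)
    counts = [[[8, 2], [3, 3]], [[2, 6], [5, 5]]]
    R = [[0.0, 0.2], [1.0, 0.5]]
    policy = nominal_policy(counts, R)
    print(policy, expected_loss(counts, R, policy))
```

The estimated loss is non-negative by construction, since each sampled MDP's optimal value is at least the value of any fixed policy in that MDP.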


Predictive Representations of State

Neural Information Processing Systems

We show that states of a dynamical system can be usefully represented by multi-step, action-conditional predictions of future observations. State representations that are grounded in data in this way may be easier to learn, generalize better, and be less dependent on accurate prior models than, for example, POMDP state representations. Building on prior work by Jaeger and by Rivest and Schapire, in this paper we compare and contrast a linear specialization of the predictive approach with the state representations used in POMDPs and in k-order Markov models. Ours is the first specific formulation of the predictive idea that includes both stochasticity and actions (controls). We show that any system has a linear predictive state representation with number of predictions no greater than the number of states in its minimal POMDP model.


Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

Neural Information Processing Systems

We consider the use of two additive control variate methods to reduce the variance of performance gradient estimates in reinforcement learning problems. The first approach we consider is the baseline method, in which a function of the current state is added to the discounted value estimate. We relate the performance of these methods, which use sample paths, to the variance of estimates based on iid data. We derive the baseline function that minimizes this variance, and we show that the variance for any baseline is the sum of the optimal variance and a weighted squared distance to the optimal baseline. We show that the widely used average discounted value baseline (where the reward is replaced by the difference between the reward and its expectation) is suboptimal. The second approach we consider is the actor-critic method, which uses an approximate value function. We give bounds on the expected squared error of its estimates. We show that minimizing distance to the true value function is suboptimal in general; we provide an example for which the true value function gives an estimate with positive variance, but the optimal value function gives an unbiased estimate with zero variance. Our bounds suggest algorithms to estimate the gradient of the performance of parameterized baseline or value functions.
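The baseline results can be illustrated in a far simpler setting than the paper's. The sketch below assumes a one-step, two-action problem with deterministic rewards and a single-parameter logistic policy, and writes the baseline as a subtracted constant b (equivalent to adding -b). It computes the mean and variance of the score-function gradient estimate (r - b) * d/dtheta log pi(a) exactly by enumeration. The names and the setting are illustrative assumptions.

```python
import math

def policy_terms(theta):
    """Action probabilities and score values d/dtheta log pi(a) for a
    two-action policy with pi(a=1) = sigmoid(theta)."""
    p = 1.0 / (1.0 + math.exp(-theta))
    probs = [1.0 - p, p]
    scores = [-p, 1.0 - p]
    return probs, scores

def gradient_mean_and_variance(theta, rewards, b):
    """Exact mean and variance of the estimate (r_a - b) * score_a."""
    probs, scores = policy_terms(theta)
    estimates = [(rewards[a] - b) * scores[a] for a in (0, 1)]
    mean = sum(pr * g for pr, g in zip(probs, estimates))
    var = sum(pr * (g - mean) ** 2 for pr, g in zip(probs, estimates))
    return mean, var

def optimal_baseline(theta, rewards):
    """Variance-minimizing constant baseline: E[r * score^2] / E[score^2]."""
    probs, scores = policy_terms(theta)
    num = sum(pr * r * s * s for pr, r, s in zip(probs, rewards, scores))
    den = sum(pr * s * s for pr, s in zip(probs, scores))
    return num / den

def average_reward_baseline(theta, rewards):
    """The commonly used baseline: expected reward under the policy."""
    probs, _ = policy_terms(theta)
    return sum(pr * r for pr, r in zip(probs, rewards))

if __name__ == "__main__":
    theta, rewards = 1.0, [0.0, 1.0]
    for name, b in (("none", 0.0),
                    ("average", average_reward_baseline(theta, rewards)),
                    ("optimal", optimal_baseline(theta, rewards))):
        print(name, gradient_mean_and_variance(theta, rewards, b))
```

In this toy case the mean is the same for every b, the variance is quadratic in b with its minimum at the optimal baseline, and the average-reward baseline differs from the optimal one unless the two actions are equally likely or equally rewarded.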


Reinforcement Learning with Long Short-Term Memory

Neural Information Processing Systems

This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task.

1 Introduction: Reinforcement learning (RL) is a way of learning how to behave based on delayed reward signals [12]. Among the more important challenges for RL are tasks where part of the state of the environment is hidden from the agent. Such tasks are called non-Markovian tasks or Partially Observable Markov Decision Processes. Many real world tasks have this problem of hidden state. For instance, in a navigation task different positions in the environment may look the same, but one and the same action may lead to different next states or rewards. Thus, hidden state makes RL more realistic.


A Bayesian Network for Real-Time Musical Accompaniment

Neural Information Processing Systems

We describe a computer system that provides a real-time musical accompaniment for a live soloist in a piece of non-improvised music for soloist and accompaniment. A Bayesian network is developed that represents the joint distribution on the times at which the solo and accompaniment notes are played, relating the two parts through a layer of hidden variables. The network is first constructed using the rhythmic information contained in the musical score. The network is then trained to capture the musical interpretations of the soloist and accompanist in an off-line rehearsal phase. During live accompaniment the learned distribution of the network is combined with a real-time analysis of the soloist's acoustic signal, performed with a hidden Markov model, to generate a musically principled accompaniment that respects all available sources of knowledge. A live demonstration will be provided.



Speech Recognition with Missing Data using Recurrent Neural Nets

Neural Information Processing Systems

In the 'missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be 'missing'. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.


Audio-Visual Sound Separation Via Hidden Markov Models

Neural Information Processing Systems

It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audiovisual information for the task of speech enhancement. We propose a method to exploit audiovisual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide-band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audiovisual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information.