AITopics

We show that states of a dynamical system can be usefully represented by multi-step, action-conditional predictions of future observations. State representations that are grounded in data in this way may be easier to learn, generalize better, and be less dependent on accurate prior models than, for example, POMDP state representations. Building on prior work by Jaeger and by Rivest and Schapire, in this paper we compare and contrast a linear specialization of the predictive approach with the state representations used in POMDPs and in k-order Markov models. Ours is the first specific formulation of the predictive idea that includes both stochasticity and actions (controls). We show that any system has a linear predictive state representation with number of predictions no greater than the number of states in its minimal POMDP model.

pomdp, representation, vector, (16 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Colorado (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Greensmith, Evan, Bartlett, Peter L., Baxter, Jonathan

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

We consider the use of two additive control variate methods to reduce the variance of performance gradient estimates in reinforcement learning problems. The first approach we consider is the baseline method, in which a function of the current state is added to the discounted value estimate. We relate the performance of these methods, which use sample paths, to the variance of estimates based on iid data. We derive the baseline function that minimizes this variance, and we show that the variance for any baseline is the sum of the optimal variance and a weighted squared distance to the optimal baseline. We show that the widely used average discounted value baseline (where the reward is replaced by the difference between the reward and its expectation) is suboptimal.

baseline, value function, variance, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Reinforcement Learning with Long Short-Term Memory

Bakker, Bram

This paper presents reinforcement learning with a Long Short Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(,x) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task. 1 Introduction Reinforcement learning (RL) is a way of learning how to behave based on delayed reward signals [12]. Among the more important challenges for RL are tasks where part of the state of the environment is hidden from the agent. Such tasks are called non-Markovian tasks or Partially Observable Markov Decision Processes. Many real world tasks have this problem of hidden state. For instance, in a navigation task different positions in the environment may look the same, but one and the same action may lead to different next states or rewards. Thus, hidden state makes RL more realistic.

agent, dependency, information, (16 more...)

Country:

Europe > Netherlands > South Holland > Leiden (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

A Bayesian Network for Real-Time Musical Accompaniment

Raphael, Christopher

We describe a computer system that provides a real-time musical accompaniment for a live soloist in a piece of non-improvised music for soloist and accompaniment. A Bayesian network is developed that represents the joint distribution on the times at which the solo and accompaniment notes are played, relating the two parts through a layer of hidden variables. The network is first constructed using the rhythmic information contained in the musical score. The network is then trained to capture the musical interpretations of the soloist and accompanist in an off-line rehearsal phase. During live accompaniment the learned distribution of the network is combined with a real-time analysis of the soloist's acoustic signal, performed with a hidden Markov model, to generate a musically principled accompaniment that respects all available sources of knowledge. A live demonstration will be provided.

accompaniment, interpretation, soloist, (13 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Sequential Noise Compensation by Sequential Monte Carlo Method

Yao, K., Nakamura, S.

We present a sequential Monte Carlo method applied to additive noise compensation for robust speech recognition in time-varying noise. The method generates a set of samples according to the prior distribution given by clean speech models and noise prior evolved from previous estimation. An explicit model representing noise effects on speech features is used, so that an extended Kalman filter is constructed for each sample, generating the updated continuous state estimate as the estimation of the noise parameter, and prediction likelihood for weighting each sample. Minimum mean square error (MMSE) inference of the time-varying noise parameter is carried out over these samples by fusion the estimation of samples according to their weights. A residual resampling selection step and a Metropolis-Hastings smoothing step are used to improve calculation efficiency. Experiments were conducted on speech recognition in simulated non-stationary noises, where noise power changed artificially, and highly non-stationary Machinegun noise. In all the experiments carried out, we observed that the method can have significant recognition performance improvement, over that achieved by noise compensation with stationary noise assumption.

noise, noise compensation, recognition, (13 more...)

Country: Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Speech Recognition using SVMs

Smith, N., Gales, Mark

An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more useful mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space.

classifier, generative model, kernel, (13 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Utah (0.04)

Technology:

Information Technology > Artificial Intelligence > Speech (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Speech Recognition with Missing Data using Recurrent Neural Nets

Parveen, S., Green, P.

In the'missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectraltemporal regions which are dominated by the speech source. The remaining regions are considered to be'missing'. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.

imputation, recognition, speech recognition, (13 more...)

Country:

Asia > Middle East > Jordan (0.06)
Asia > China > Beijing > Beijing (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Hershey, John R., Casey, Michael

Audio-Visual Sound Separation Via Hidden Markov Models

It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audiovisual information for the task of speech enhancement. We propose a method to exploit audiovisual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factori ally combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audiovisual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information.

enhancement, information, speech, (16 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Industry: Automobiles & Trucks (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Brown, Andrew D., Hinton, Geoffrey E.

Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's.

hmm, probability, sequence, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Learning Discriminative Feature Transforms to Low Dimensions in Low Dimentions

Torkkola, Kari

The marriage of Renyi entropy with Parzen density estimation has been shown to be a viable tool in learning discriminative feature transforms. However, it suffers from computational complexity proportional to the square of the number of samples in the training data. This sets a practical limit to using large databases. We suggest immediate divorce of the two methods and remarriage of Renyi entropy with a semi-parametric density estimation method, such as a Gaussian Mixture Models (GMM). This allows all of the computation to take place in the low dimensional target space, and it reduces computational complexity proportional to square of the number of components in the mixtures. Furthermore, a convenient extension to Hidden Markov Models as commonly used in speech recognition becomes possible.

feature transform, mutual information, output space, (14 more...)

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.57)