AITopics

An important issue in applying SVMs to speech recognition is the ability to classify variable length sequences. This paper presents extensions to a standard scheme for handling this variable length data, the Fisher score. A more useful mapping is introduced based on the likelihood-ratio. The score-space defined by this mapping avoids some limitations of the Fisher score. Class-conditional generative models are directly incorporated into the definition of the score-space.
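As a rough illustration of the two mappings named in this abstract, the sketch below turns a variable-length sequence into a fixed-length vector. It assumes one-dimensional Gaussian class models purely for brevity, and the function names (`fisher_score`, `likelihood_ratio_score`) are hypothetical. The likelihood-ratio layout shown here (log-likelihood ratio followed by the derivatives with respect to both models' parameters) is one plausible reading of the abstract, not necessarily the paper's exact definition.

```python
import numpy as np

def gaussian_loglik(x, mu, var):
    """Log-likelihood of a sequence of i.i.d. scalars under N(mu, var)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)))

def fisher_score(x, mu, var):
    """Gradient of the sequence log-likelihood w.r.t. (mu, var).

    The output always has two entries, whatever the sequence length."""
    x = np.asarray(x, dtype=float)
    d_mu = np.sum((x - mu) / var)
    d_var = np.sum((x - mu) ** 2 / (2 * var ** 2) - 1.0 / (2 * var))
    return np.array([d_mu, d_var])

def likelihood_ratio_score(x, model_a, model_b):
    """Assumed layout: [log p_a(x) - log p_b(x), grad_a log p_a, -grad_b log p_b]."""
    llr = gaussian_loglik(x, *model_a) - gaussian_loglik(x, *model_b)
    return np.concatenate(([llr], fisher_score(x, *model_a), -fisher_score(x, *model_b)))

# Sequences of different lengths map to vectors of the same size,
# so a standard SVM can consume them.
short_seq = [0.2, 1.1, 0.7]
long_seq = [0.2, 1.1, 0.7, 1.9, 2.4, 0.3, 1.0]
model_a, model_b = (0.0, 1.0), (2.0, 1.5)  # (mean, variance), made-up values
features_short = likelihood_ratio_score(short_seq, model_a, model_b)
features_long = likelihood_ratio_score(long_seq, model_a, model_b)
```

Both class-conditional models appear explicitly in the feature vector, which is the property the abstract highlights.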

classifier, generative model, kernel, (13 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Utah (0.04)

Technology:

Information Technology > Artificial Intelligence > Speech (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Speech Recognition with Missing Data using Recurrent Neural Nets

Parveen, S., Green, P.

In the 'missing data' approach to improving the robustness of automatic speech recognition to added noise, an initial process identifies spectral-temporal regions which are dominated by the speech source. The remaining regions are considered to be 'missing'. In this paper we develop a connectionist approach to the problem of adapting speech recognition to the missing data case, using Recurrent Neural Networks. In contrast to methods based on Hidden Markov Models, RNNs allow us to make use of long-term time constraints and to make the problems of classification with incomplete data and imputing missing values interact. We report encouraging results on an isolated digit recognition task.

imputation, recognition, speech recognition, (13 more...)

Country:

Asia > Middle East > Jordan (0.06)
Asia > China > Beijing > Beijing (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

Meinecke, Frank C., Ziehe, Andreas, Kawanabe, Motoaki, Müller, Klaus-Robert

Estimating the Reliability of ICA Projections

When applying unsupervised learning techniques like ICA or temporal decorrelation, a key question is whether the discovered projections are reliable. In other words: can we give error bars or can we assess the quality of our separation? We use resampling methods to tackle these questions and show experimentally that our proposed variance estimations are strongly correlated to the separation error. We demonstrate that this reliability estimation can be used to choose the appropriate ICA-model, to enhance significantly the separation performance, and, most important, to mark the components that have an actual physical meaning.

algorithm, projection, separation error, (13 more...)

Country:

North America > United States > New York (0.05)
Europe > Germany > Brandenburg > Potsdam (0.05)
Europe > Sweden (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.35)

Hershey, John R., Casey, Michael

Audio-Visual Sound Separation Via Hidden Markov Models

It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audiovisual information for the task of speech enhancement. We propose a method to exploit audiovisual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information and employ novel signal HMMs in which the dynamics of narrow-band and wide-band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audiovisual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information.

enhancement, information, speech, (16 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Industry: Automobiles & Trucks (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Frey, Brendan J., Kristjansson, Trausti T., Deng, Li, Acero, Alex

ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition

A challenging, unsolved problem in the speech recognition community is recognizing speech signals that are corrupted by loud, highly nonstationary noise. One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it into a clean speech recognizer. In previous work published in Eurospeech, we showed how a probability model trained on clean speech and a separate probability model trained on noise could be combined for the purpose of estimating the noise-free speech from the noisy speech. We showed how an iterative 2nd order vector Taylor series approximation could be used for probabilistic inference in this model. In many circumstances, it is not possible to obtain examples of noise without speech.

algonquin, noise model, speech, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Plymouth County > Norwell (0.05)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

A Sequence Kernel and its Application to Speaker Recognition

Campbell, William M.

A novel approach for comparing sequences of observations using an explicit-expansion kernel is demonstrated. The kernel is derived using the assumption of the independence of the sequence of observations and a mean-squared error training criterion. The use of an explicit expansion kernel reduces classifier model size and computation dramatically, resulting in model sizes and computation one-hundred times smaller in our application. The explicit expansion also preserves the computational advantages of an earlier architecture based on mean-squared error training. Training using standard support vector machine methodology gives accuracy that significantly exceeds the performance of state-of-the-art mean-squared error training for a speaker recognition task.
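To make the idea of an explicit-expansion sequence kernel concrete, here is a minimal sketch under stated assumptions: each observation is mapped through a degree-2 polynomial expansion, the expansions are averaged over the sequence, and two sequences are compared with a plain inner product. The function names are hypothetical, and the sketch leaves out any normalisation or weighting the paper's kernel may apply, so it should be read as an illustration of the general technique rather than the author's method.

```python
from itertools import combinations_with_replacement

import numpy as np

def poly_expand(frame, degree=2):
    """Explicit polynomial expansion: all monomials of the inputs up to `degree`."""
    frame = np.asarray(frame, dtype=float)
    terms = [1.0]  # constant term
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(frame)), d):
            terms.append(np.prod(frame[list(idx)]))
    return np.array(terms)

def sequence_embedding(seq, degree=2):
    """Average the per-frame expansions, treating frames as independent."""
    return np.mean([poly_expand(f, degree) for f in seq], axis=0)

def sequence_kernel(seq_a, seq_b, degree=2):
    """Inner product of the two fixed-length sequence embeddings."""
    return float(sequence_embedding(seq_a, degree) @ sequence_embedding(seq_b, degree))

# Two sequences of 2-dimensional frames with different lengths.
seq_a = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
seq_b = [[1.0, 1.0], [0.0, 2.0]]
k_ab = sequence_kernel(seq_a, seq_b)
```

Because each sequence collapses to a single fixed-length vector, a trained model only needs to store one weight vector per class, which is consistent with the reduction in model size and computation that the abstract reports.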

kernel, polynomial classifier, recognition, (14 more...)

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Speech Recognition (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.60)

Brown, Andrew D., Hinton, Geoffrey E.

Relative Density Nets: A New Way to Combine Backpropagation with HMM's

Logistic units in the first hidden layer of a feedforward neural network compute the relative probability of a data point under two Gaussians. This leads us to consider substituting other density models. We present an architecture for performing discriminative learning of Hidden Markov Models using a network of many small HMM's. Experiments on speech data show it to be superior to the standard method of discriminatively training HMM's.
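The opening claim of this abstract can be checked numerically. The sketch below assumes the simplest case, which the abstract does not spell out: one-dimensional inputs, two Gaussians with a shared variance, and equal class priors. Under those assumptions the relative probability p1(x) / (p1(x) + p2(x)) equals a logistic function of a linear form w*x + b, with w = (m1 - m2) / var and b = (m2^2 - m1^2) / (2 * var). All function names are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def posterior_from_densities(x, m1, m2, var):
    """Relative probability of x under Gaussian 1 versus Gaussian 2 (equal priors)."""
    p1, p2 = gaussian_pdf(x, m1, var), gaussian_pdf(x, m2, var)
    return p1 / (p1 + p2)

def logistic_unit(x, m1, m2, var):
    """Logistic unit whose weight and bias are derived from the two Gaussians."""
    w = (m1 - m2) / var
    b = (m2 ** 2 - m1 ** 2) / (2 * var)
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

# The two computations agree on a grid of inputs.
xs = np.linspace(-3.0, 3.0, 13)
from_densities = posterior_from_densities(xs, m1=1.0, m2=-0.5, var=0.8)
from_logistic = logistic_unit(xs, m1=1.0, m2=-0.5, var=0.8)
agree = bool(np.allclose(from_densities, from_logistic))
```

Replacing `gaussian_pdf` with the likelihood of another density model, such as an HMM, gives the kind of substitution the abstract goes on to describe.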

hmm, probability, sequence, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Intransitive Likelihood-Ratio Classifiers

Bilmes, Jeff, Ji, Gang, Meila, Marina

In this work, we introduce an information-theoretic based correction term to the likelihood ratio classification method for multiple classes. Under certain conditions, the term is sufficient for optimally correcting the difference between the true and estimated likelihood ratio, and we analyze this in the Gaussian case. We find that the new correction term significantly improves the classification results when tested on medium vocabulary speech recognition tasks. Moreover, the addition of this term makes the class comparisons analogous to an intransitive game and we therefore use several tournament-like strategies to deal with this issue. We find that further small improvements are obtained by using an appropriate tournament. Lastly, we find that intransitivity appears to be a good measure of classification confidence.

kl-divergence, likelihood ratio, tournament, (15 more...)

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.47)

Industry: Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Yamasaki, Toshihiko, Shibata, Tadashi

Analog Soft-Pattern-Matching Classifier using Floating-Gate MOS Technology

A flexible pattern-matching analog classifier is presented in conjunction with a robust image representation algorithm called Principal Axes Projection (PAP). In the circuit, the functional form of matching is configurable in terms of the peak position, the peak height and the sharpness of the similarity evaluation. The test chip was fabricated in a 0.6-µm CMOS technology and successfully applied to handwritten pattern recognition and medical radiograph analysis using PAP as a feature extraction pre-processing step for robust image coding. The separation and classification of overlapping patterns is also experimentally demonstrated.

residue vector, template, vector, (15 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.16)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)

Shon, Aaron P., Hsu, David, Diorio, Chris

Learning Spike-Based Correlations and Conditional Probabilities in Silicon

We have designed and fabricated a VLSI synapse that can learn a conditional probability or correlation between spike-based inputs and feedback signals. The synapse is low power, compact, provides nonvolatile weight storage, and can perform simultaneous multiplication and adaptation. We can calibrate arrays of synapses to ensure uniform adaptation characteristics. Finally, adaptation in our synapse does not necessarily depend on the signals used for computation. Consequently, our synapse can implement learning rules that correlate past and present synaptic activity. We provide analysis and experimental chip results demonstrating the operation in learning and calibration mode, and show how to use our synapse to implement various learning rules in silicon.

equilibrium weight, synapse, voltage, (14 more...)

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > San Diego County > San Diego (0.05)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)