AITopics

We present an MS-TDNN for recognizing continuously spelled letters, a task characterized by a small but highly confusable vocabulary.

boundary, neural network, recognition, (13 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada > Ontario > Toronto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Hybrid Neural Net System for State-of-the-Art Continuous Speech Recognition

Zavaliagkos, G., Zhao, Y., Schwartz, R., Makhoul, J.

Until recently, state-of-the-art, large-vocabulary, continuous speech recognition (CSR) has employed Hidden Markov Modeling (HMM) to model speech sounds. In an attempt to improve over HMM we developed a hybrid system that integrates HMM technology with neural networks. We present the concept of a "Segmental Neural Net" (SNN) for phonetic modeling in CSR. By taking into account all the frames of a phonetic segment simultaneously, the SNN overcomes the well-known conditional-independence limitation of HMMs. In several speaker-independent experiments with the DARPA Resource Management corpus, the hybrid system showed a consistent improvement in performance over the baseline HMM system. 1 INTRODUCTION The current state of the art in continuous speech recognition (CSR) is based on the use of hidden Markov models (HMM) to model phonemes in context.

error rate, hybrid neural net system, hybrid system, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Tebelskis, Joe, Waibel, Alex

Performance Through Consistency: MS-TDNN's for Large Vocabulary Continuous Speech Recognition

Connectionist speech recognition systems are often handicapped by an inconsistency between training and testing criteria. This problem is addressed by the Multi-State Time Delay Neural Network (MS-TDNN), a hierarchical phoneme and word classifier which uses DTW to modulate its connectivity.
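For readers unfamiliar with DTW (dynamic time warping), the alignment step named in the abstract above can be illustrated with a minimal, generic sketch. This is not the MS-TDNN itself: the function name `dtw_distance`, the scalar inputs, and the absolute-difference local cost are all assumptions chosen for illustration, whereas the paper aligns network activations with word models.

```python
def dtw_distance(a, b):
    """Cumulative DTW cost between two numeric sequences (illustrative only)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute difference of the two samples.
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A sequence and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The point of the sketch is only that DTW lets one sequence stretch or compress in time against another, which is why it suits words spoken at different rates.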

level training, recognition, speech recognition, (13 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.58)

Príncipe, José Carlos, Zahalka, Abir

Transient Signal Detection with Neural Networks: The Search for the Desired Signal

Matched filtering has been one of the most powerful techniques employed for transient detection. Here we will show that a dynamic neural network outperforms the conventional approach. When the artificial neural network (ANN) is trained with supervised learning schemes there is a need to supply the desired signal for all time, although we are only interested in detecting the transient. In this paper we also show the effects on the detection agreement of different strategies to construct the desired signal. The extension of the Bayes decision rule (0/1 desired signal), optimal in static classification, performs worse than desired signals constructed by random noise or prediction during the background.
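As background for the conventional approach named in the abstract above, a matched filter can be sketched as a sliding correlation of the input with a known transient template. This shows only the baseline technique, not the paper's neural network; the function names, the toy signal, and the fixed threshold are assumptions for illustration.

```python
def matched_filter(signal, template):
    """Correlate the signal with the template at every valid lag."""
    n, m = len(signal), len(template)
    return [
        sum(signal[k + i] * template[i] for i in range(m))
        for k in range(n - m + 1)
    ]


def detect(signal, template, threshold):
    """Return the lags whose correlation score reaches the threshold."""
    scores = matched_filter(signal, template)
    return [k for k, score in enumerate(scores) if score >= threshold]


# Toy example: the transient [1, 2, 1] is embedded at lag 2.
signal = [0, 0, 1, 2, 1, 0, 0]
template = [1, 2, 1]
print(matched_filter(signal, template))
print(detect(signal, template, threshold=5))
```

In practice the threshold would be set from the noise statistics. Here it is simply picked by hand.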

background, neural network, spike, (13 more...)

Country:

North America > United States > New York (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Florida > Alachua County > Gainesville (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Modeling Consistency in a Speaker Independent Continuous Speech Recognition System

Konig, Yochai, Morgan, Nelson, Wooters, Chuck, Abrash, Victor, Cohen, Michael, Franco, Horacio

We would like to incorporate speaker-dependent consistencies, such as gender, in an otherwise speaker-independent speech recognition system. In this paper we discuss a Gender Dependent Neural Network (GDNN) which can be tuned for each gender, while sharing most of the speaker independent parameters. We use a classification network to help generate gender-dependent phonetic probabilities for a statistical (HMM) recognition system. The gender classification net predicts the gender with high accuracy, 98.3% on a Resource Management test set. However, the integration of the GDNN into our hybrid HMM-neural network recognizer provided an improvement in the recognition score that is not statistically significant on a Resource Management test set.

architecture, gender, probability, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Maryland > Baltimore (0.04)
(4 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.32)

Lee, Wei-Tsih, Pearson, John

A Hybrid Linear/Nonlinear Approach to Channel Equalization Problems

Channel equalization is an important problem in high-speed communications. The sequences of symbols transmitted are distorted by neighboring symbols. Traditionally, the channel equalization problem is considered as a channel-inversion operation. One problem of this approach is that there is no direct correspondence between error probability and residual error produced by the channel inversion operation. In this paper, the optimal equalizer design is formulated as a classification problem. The optimal classifier can be constructed by the Bayes decision rule. In general it is nonlinear. An efficient hybrid linear/nonlinear equalizer approach has been proposed to train the equalizer. The error probability of the new linear/nonlinear equalizer has been shown to be better than that of a linear equalizer in an experimental channel.

equalizer, linear equalizer, rb network, (14 more...)

Country:

North America > United States > New York (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Liu, Weimin, Andreou, Andreas G., Goldstein, Moise H., Jr.

Analog Cochlear Model for Multiresolution Speech Analysis

The tradeoff between time and frequency resolution is viewed as the fundamental difference between conventional spectrographic analysis and cochlear signal processing for broadband, rapid-changing signals. The model's response exhibits a wavelet-like analysis in the scale domain that preserves good temporal resolution; the frequency of each spectral component in a broadband signal can be accurately determined from the interpeak intervals in the instantaneous firing rates of auditory fibers. Such properties of the cochlear model are demonstrated with natural speech and synthetic complex signals. 1 Introduction As a nonparametric tool, the spectrogram, or short-term Fourier transform, is widely used in analyzing non-stationary signals, such as speech. Usually a window is applied to the running signal and then the Fourier transform is performed. The specific window applied determines the tradeoff between temporal and spectral resolutions of the analysis, as indicated by the uncertainty principle [1].
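The window tradeoff described in the abstract above can be made concrete with a rough calculation. The sketch below takes the window duration as the time resolution and the DFT bin spacing as the frequency resolution; both are simplifications, since the true resolution also depends on the window shape. The 16 kHz sample rate and the window lengths are assumed values, not figures from the paper.

```python
def stft_resolution(window_len, sample_rate):
    """Approximate (time resolution in s, frequency resolution in Hz)."""
    time_res = window_len / sample_rate   # duration covered by one window
    freq_res = sample_rate / window_len   # spacing between adjacent DFT bins
    return time_res, freq_res


# Assumed sample rate; longer windows sharpen frequency but blur time.
SAMPLE_RATE = 16000
for window_len in (64, 256, 1024):
    time_res, freq_res = stft_resolution(window_len, SAMPLE_RATE)
    print(f"N={window_len:5d}  dt={time_res * 1000:.1f} ms  df={freq_res:.2f} Hz")
```

Under this approximation the product of the two resolutions is constant, so improving one necessarily degrades the other. The abstract attributes this tradeoff to the uncertainty principle.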

analog cochlear model, frequency, resolution, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Reading (0.04)
North America > United States > Maryland > Montgomery County > Germantown (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California (0.04)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Artificial Intelligence (0.95)
Information Technology > Data Science > Data Quality > Data Transformation (0.56)

Hirayama, Makoto, Vatikiotis-Bateson, Eric, Honda, Kiyoshi, Koike, Yasuharu, Kawato, Mitsuo

Physiologically Based Speech Synthesis

This study demonstrates a paradigm for modeling speech production based on neural networks. Using physiological data from speech utterances, a neural network learns the forward dynamics relating motor commands to muscles and the ensuing articulator behavior that allows articulator trajectories to be generated from motor commands constrained by phoneme input strings and global performance parameters. From these movement trajectories, a second neural network generates PARCOR parameters that are then used to synthesize the speech acoustics.

speech, trajectory, utterance, (15 more...)

Country:

Asia > Middle East > Jordan (0.07)
North America > United States > Utah (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
(2 more...)

Industry: Health & Medicine (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Cohen, Michael, Franco, Horacio, Morgan, Nelson, Rumelhart, David E., Abrash, Victor

Context-Dependent Multiple Distribution Phonetic Modeling with MLPs

A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard.

mlp, output layer, probability, (12 more...)

Country:

North America > United States > New Mexico (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Some Estimates of Necessary Number of Connections and Hidden Units for Feed-Forward Networks

Kowalczyk, Adam

The feed-forward networks with fixed hidden units (FHU-networks) are compared against the category of remaining feed-forward networks with variable hidden units (VHU-networks).

dichotomy, general position, synaptic weight, (17 more...)

Country:

Oceania > Australia (0.05)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)