Country
A Segment-Based Automatic Language Identification System
Muthusamy, Yeshwant K., Cole, Ronald A.
Automatic language identification is the rapid automatic determination of the language being spoken, by any speaker, saying anything. Despite several important applications of automatic language identification, this area has suffered from a lack of basic research and the absence of a standardized, public-domain database of languages. It is well known that languages have characteristic sound patterns. Languages have been described subjectively as "singsong", "rhythmic", "guttural", "nasal" etc. The key to solving the problem of automatic language identification is the detection and exploitation of such differences between languages. We assume that each language in the world has a unique acoustic structure, and that this structure can be defined in terms of phonetic and prosodic features of speech.
Propagation Filters in PDS Networks for Sequencing and Ambiguity Resolution
Sumida, Ronald A., Dyer, Michael G.
We present a Parallel Distributed Semantic (PDS) Network architecture that addresses the problems of sequencing and ambiguity resolution in natural language understanding. A PDS Network stores phrases and their meanings using multiple PDP networks, structured in the form of a semantic net. A mechanism called Propagation Filters is employed: (1) to control communication between networks, (2) to properly sequence the components of a phrase, and (3) to resolve ambiguities. Simulation results indicate that PDS Networks and Propagation Filters can successfully represent high-level knowledge, can be trained relatively quickly, and provide for parallel inferencing at the knowledge level. 1 INTRODUCTION Backpropagation has shown considerable potential for addressing problems in natural language processing (NLP). However, the traditional PDP [Rumelhart and McClelland, 1986] approach of using one (or a small number) of backprop networks for NLP has been plagued by a number of problems: (1) it has been largely unsuccessful at representing high-level knowledge, (2) the networks are slow to train, and (3) they are sequential at the knowledge level.
A Connectionist Learning Approach to Analyzing Linguistic Stress
Gupta, Prahlad, Touretzky, David S.
We use connectionist modeling to develop an analysis of stress systems in terms of ease of learnability. In traditional linguistic analyses, learnability arguments determine default parameter settings based on the feasibilty of logicall y deducing correct settings from an initial state. Our approach provides an empirical alternative to such arguments. Based on perceptron learning experiments using data from nineteen human languages, we develop a novel characterization of stress patterns in terms of six parameters. These provide both a partial description of the stress pattern itself and a prediction of its learnability, without invoking abstract theoretical constructs such as metrical feet. This work demonstrates that machine learning methods can provide a fresh approach to understanding linguistic phenomena.
Constructing Proofs in Symmetric Networks
This paper considers the problem of expressing predicate calculus in connectionist networks that are based on energy minimization. Given a firstorder-logic knowledge base and a bound k, a symmetric network is constructed (like a Boltzman machine or a Hopfield network) that searches for a proof for a given query. If a resolution-based proof of length no longer than k exists, then the global minima of the energy function that is associated with the network represent such proofs. The network that is generated is of size cubic in the bound k and linear in the knowledge size. There are no restrictions on the type of logic formulas that can be represented.
Generalization Performance in PARSEC - A Structured Connectionist Parsing Architecture
This paper presents PARSECa system for generating connectionist parsing networks from example parses. PARSEC is not based on formal grammar systems and is geared toward spoken language tasks. PARSEC networks exhibit three strengths important for application to speech processing: 1) they learn to parse, and generalize well compared to handcoded grammars; 2) they tolerate several types of noise; 3) they can learn to use multi-modal input. Presented are the PARSEC architecture and performance analyses along several dimensions that demonstrate PARSEC's features. PARSEC's performance is compared to that of traditional grammar-based parsing systems.
Forward Dynamics Modeling of Speech Motor Control Using Physiological Data
Hirayama, Makoto, Vatikiotis-Bateson, Eric, Kawato, Mitsuo, Jordan, Michael I.
We propose a paradigm for modeling speech production based on neural networks. We focus on characteristics of the musculoskeletal system. Using real physiological data - articulator movements and EMG from muscle activitya neural network learns the forward dynamics relating motor commands to muscles and the ensuing articulator behavior. After learning, simulated perturbations, were used to asses properties of the acquired model, such as natural frequency, damping, and interarticulator couplings. Finally, a cascade neural network is used to generate continuous motor commands from a sequence of discrete articulatory targets.
JANUS: Speech-to-Speech Translation Using Connectionist and Non-Connectionist Techniques
Waibel, Alex, Jain, Ajay N., McNair, Arthur E., Tebelskis, Joe, Osterholtz, Louise, Saito, Hiroaki, Schmidbauer, Otto, Sloboda, Tilo, Woszczyna, Monika
JANUS translates continuously spoken English and German into German, English, and Japanese. JANUS currently achieves 87% translation fidelity from English speech and 97% from German speech. We present the JANUS system along with comparative evaluations of its interchangeable processing components, with special emphasis on the connectionist modules.
Neural Network - Gaussian Mixture Hybrid for Speech Recognition or Density Estimation
Bengio, Yoshua, Mori, Renato De, Flammia, Giovanni, Kompe, Ralf
The subject of this paper is the integration of multi-layered Artificial Neural Networks (ANN) with probability density functions such as Gaussian mixtures found in continuous density Hidden Markov Models (HMM). In the first part of this paper we present an ANN/HMM hybrid in which all the parameters of the the system are simultaneously optimized with respect to a single criterion. In the second part of this paper, we study the relationship between the density of the inputs of the network and the density of the outputs of the networks. A few experiments are presented to explore how to perform density estimation with ANNs. 1 INTRODUCTION This paper studies the integration of Artificial Neural Networks (ANN) with probability density functions (pdf) such as the Gaussian mixtures often used in continuous density Hidden Markov Models. The ANNs considered here are multi-layered or recurrent networks with hyperbolic tangent hidden units.
Connectionist Optimisation of Tied Mixture Hidden Markov Models
Renals, Steve, Morgan, Nelson, Bourlard, Hervรฉ, Franco, Horacio, Cohen, Michael
Issues relating to the estimation of hidden Markov model (HMM) local probabilities are discussed. In particular we note the isomorphism of radial basis functions (RBF) networks to tied mixture density modellingj additionally we highlight the differences between these methods arising from the different training criteria employed. We present a method in which connectionist training can be modified to resolve these differences and discuss some preliminary experiments. Finally, we discuss some outstanding problems with discriminative training.
Improved Hidden Markov Model Speech Recognition Using Radial Basis Function Networks
Singer, Elliot, Lippmann, Richard P.
The RBF network consists of an input layer, a hidden layer composed of Gaussian basis functions, and an output layer. Connections from the input layer to the hidden layer are fixed at unity while those from the hidden layer to the output layer are trained by minimizing the overall mean-square error between actual and desired output values. Each RBF output node has a corresponding state in a set of HMM word models which represent the words in the vocabulary. HMM word models are left-to-right with no skip states and have a one-state background noise model at either end. The background noise models are identical for all words.