AITopics

A method for incorporating context-dependent phone classes in a connectionist-HMM hybrid speech recognition system is introduced. A modular approach is adopted, where single-layer networks discriminate between different context classes given the phone class and the acoustic data. The context networks are combined with a context-independent (CI) network to generate context-dependent (CD) phone probability estimates. Experiments show an average reduction in word error rate of 16% and 13% from the CI system on ARPA 5,000 word and SQALE 20,000 word tasks respectively. Due to improved modelling, the CD system decodes more than twice as fast as the CI system.
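As a rough illustration of how the context networks and the CI network might be combined, the sketch below assumes the factorization P(context, phone | x) = P(context | phone, x) * P(phone | x). The abstract does not spell out the combination rule, so this product form, the function name `cd_posteriors`, and the toy phone and context labels are all assumptions made for illustration.

```python
def cd_posteriors(ci_posteriors, context_posteriors):
    """Combine CI phone posteriors with per-phone context posteriors.

    ci_posteriors:      {phone: P(phone | x)}, as a CI network might output.
    context_posteriors: {phone: {context: P(context | phone, x)}}, as the
                        per-phone context networks might output.
    Returns {(phone, context): P(context, phone | x)} under the assumed
    product factorization.
    """
    joint = {}
    for phone, p_phone in ci_posteriors.items():
        for context, p_context in context_posteriors[phone].items():
            joint[(phone, context)] = p_context * p_phone
    return joint


# Hypothetical outputs for one acoustic frame (values invented).
ci = {"ae": 0.7, "eh": 0.3}
ctx = {
    "ae": {"left_k": 0.6, "left_s": 0.4},
    "eh": {"left_k": 0.5, "left_s": 0.5},
}
joint = cd_posteriors(ci, ctx)
```

Because each factor is a proper distribution, the combined CD estimates sum to one over all (phone, context) pairs without any extra normalization.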

context class, hochberg, training data, (9 more...)

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Industry: Government > Military (0.35)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Onset-based Sound Segmentation

Smith, Leslie S.

A technique for segmenting sounds using processing based on mammalian early auditory processing is presented. The technique is based on features in sound which neuron spike recording suggests are detected in the cochlear nucleus. The sound signal is bandpassed and each signal processed to enhance onsets and offsets. The onset and offset signals are compressed, then clustered both in time and across frequency channels using a network of integrate-and-fire neurons. Onsets and offsets are signalled by spikes, and the timing of these spikes is used to segment the sound.

1 Background

Traditional speech interpretation techniques based on Fourier transforms, spectrum recoding, and a hidden Markov model or neural network interpretation stage have limitations both in continuous speech and in interpreting speech in the presence of noise, and this has led to interest in front ends modelling biological auditory systems for speech interpretation systems (Ainsworth and Meyer 92; Cosi 93; Cole et al 95).
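The onset-to-spike idea in the abstract can be sketched in a few lines for a single frequency channel. This is only a toy under stated assumptions: the onset enhancement is taken to be a half-wave rectified first difference of the channel envelope, and the neuron is a simple leaky integrate-and-fire unit with invented `threshold` and `leak` values. The paper's actual filters, compression, and cross-channel clustering are not reproduced here.

```python
def onset_signal(envelope):
    """Half-wave rectified first difference: keeps only rises in the envelope.

    Element t describes the change from sample t to sample t + 1.
    """
    return [max(0.0, later - earlier)
            for earlier, later in zip(envelope, envelope[1:])]


def integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire neuron; returns the time steps at which it spikes."""
    potential = 0.0
    spikes = []
    for t, drive in enumerate(inputs):
        # Decay the stored potential, then add the new input.
        potential = potential * leak + drive
        if potential >= threshold:
            spikes.append(t)
            potential = 0.0  # reset after firing
    return spikes


# Hypothetical channel envelope with two abrupt rises (values invented).
envelope = [0.0, 0.0, 0.1, 1.5, 1.6, 1.6, 0.2, 0.1, 2.0, 2.1]
spike_times = integrate_and_fire(onset_signal(envelope))
```

In this toy the neuron fires once per abrupt rise, so the spike times mark candidate segment boundaries; an offset detector could be built the same way from the rectified negative difference.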

cochlear nucleus, neuron, segmentation, (13 more...)

Country:

North America > United States (0.04)
North America > Canada (0.04)
Europe > United Kingdom > Scotland (0.04)
(2 more...)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Indiveri, Giacomo, Kramer, Jörg, Koch, Christof

Parallel analog VLSI architectures for computation of heading direction and time-to-contact

To exploit their properties at a system level, we developed parallel image processing architectures for applications that rely mostly on the qualitative properties of the optical flow, rather than on the precise values of the velocity vectors. Specifically, we designed two parallel architectures that employ arrays of elementary motion sensors for the computation of heading direction and time-to-contact. The application domain that we took into consideration for the implementation of such architectures is the promising one of vehicle navigation. Having defined the types of images to be analyzed and the types of processing to perform, we were able to use a priori information to integrate selectively the sparse data obtained from the velocity sensors and determine the qualitative properties of the optical flow field of interest.

architecture, optical flow field, velocity sensor, (11 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Industry: Semiconductors & Electronics (0.75)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.87)
Information Technology > Artificial Intelligence > Vision (0.60)

Jackson, Jeffrey C., Craven, Mark

Learning Sparse Perceptrons

We introduce a new algorithm designed to learn sparse perceptrons over input representations which include high-order features. Our algorithm, which is based on a hypothesis-boosting method, is able to PAC-learn a relatively natural class of target concepts. Moreover, the algorithm appears to work well in practice: on a set of three problem domains, the algorithm produces classifiers that utilize small numbers of features yet exhibit good generalization performance. Perhaps most importantly, our algorithm generates concept descriptions that are easy for humans to understand. However, in many applications, such as those that may involve scientific discovery, it is crucial to be able to explain predictions.

algorithm, hypothesis, perceptron, (16 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Using Unlabeled Data for Supervised Learning

Towell, Geoffrey G.

For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats. One response to the expense of class labels is to squeeze the most information possible out of each labeled example. Regularization and cross-validation both have this goal. A second response is to start with a small set of labeled examples and request labels of only those currently unlabeled examples that are expected to provide a significant improvement in the behavior of the classifier (Lewis & Catlett, 1994; Freund et al., 1993). A third response is to tap into a largely ignored potential source of information; namely, unlabeled examples. This response is supported by the theoretical work of Castelli and Cover (1995) which suggests that unlabeled examples have value in learning classification problems.

information, sulu, unlabeled example, (15 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.42)

Is Learning The n-th Thing Any Easier Than Learning The First?

Thrun, Sebastian

This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks. Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks.

knowledge, neural network, representation, (15 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Asia > Middle East > Israel (0.05)
(4 more...)

Genre:

Overview (0.74)
Research Report (0.54)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.77)

Schraudolph, Nicol N., Sejnowski, Terrence J.

Tempering Backpropagation Networks: Not All Weights are Created Equal

Backpropagation learning algorithms typically collapse the network's structure into a single vector of weight parameters to be optimized. We suggest that their performance may be improved by utilizing the structural information instead of discarding it, and introduce a framework for "tempering" each weight accordingly. In the tempering model, activation and error signals are treated as approximately independent random variables. The characteristic scale of weight changes is then matched to that of the residuals, allowing structural properties such as a node's fan-in and fan-out to affect the local learning rate and backpropagated error. The model also permits calculation of an upper bound on the global learning rate for batch updates, which in turn leads to different update rules for bias vs. non-bias weights. This approach yields hitherto unparalleled performance on the family relations benchmark, a deep multi-layer network: for both batch learning with momentum and the delta-bar-delta algorithm, convergence at the optimal learning rate is sped up by more than an order of magnitude.

global learning rate, learning rate, tempering backpropagation network, (14 more...)

Country:

North America > United States > Colorado > Denver County > Denver (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(3 more...)

Industry: Education (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.65)

Opitz, David W., Shavlik, Jude W.

Generating Accurate and Diverse Members of a Neural-Network Ensemble

In particular, combining separately trained neural networks (commonly referred to as a neural-network ensemble) has been demonstrated to be particularly successful (Alpaydin, 1993; Drucker et al., 1994; Hansen and Salamon, 1990; Hashem et al., 1994; Krogh and Vedelsby, 1995; Maclin and Shavlik, 1995; Perrone, 1992). Both theoretical (Hansen and Salamon, 1990; Krogh and Vedelsby, 1995) and empirical (Hashem et al., 1994; Maclin and Shavlik, 1995) work has shown that a good ensemble is one where the individual networks are both accurate and make their errors on different parts of the input space; however, most previous work has either focussed on combining the output of multiple trained networks or only indirectly addressed how we should generate a good set of networks.
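The point that a good ensemble needs members that are accurate but err on different inputs can be seen in a toy example. The sketch below uses plain majority voting over three hypothetical classifiers whose single mistakes fall on different examples; the labels and predictions are invented, and majority voting is just one simple combination rule, not necessarily the one used in the paper.

```python
from collections import Counter


def majority_vote(member_predictions):
    """Combine per-model label lists by taking the most common label per example."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*member_predictions)]


def accuracy(predicted, truth):
    """Fraction of examples labelled correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Invented ground truth and three members, each wrong on a different example.
truth = [1, 0, 1, 1, 0, 0]
model_a = [0, 0, 1, 1, 0, 0]  # wrong on example 0
model_b = [1, 0, 0, 1, 0, 0]  # wrong on example 2
model_c = [1, 0, 1, 1, 1, 0]  # wrong on example 4

ensemble = majority_vote([model_a, model_b, model_c])
member_accuracies = [accuracy(m, truth) for m in (model_a, model_b, model_c)]
ensemble_accuracy = accuracy(ensemble, truth)
```

Each member is outvoted wherever it errs, so the ensemble beats every individual member here. If all three made the same mistake, voting would not help, which is why diversity matters as much as accuracy.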

algorithm, ensemble, neural network, (13 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.05)
North America > United States > Minnesota > St. Louis County > Duluth (0.04)
(6 more...)

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Ghahramani, Zoubin, Jordan, Michael I.

Factorial Hidden Markov Models

Due to the simplicity and efficiency of its parameter estimation algorithm, the hidden Markov model (HMM) has emerged as one of the basic statistical tools for modeling discrete time series, finding widespread application in the areas of speech recognition (Rabiner and Juang, 1986) and computational molecular biology (Baldi et al., 1994). An HMM is essentially a mixture model, encoding information about the history of a time series in the value of a single multinomial variable (the hidden state). This multinomial assumption allows an efficient parameter estimation algorithm to be derived (the Baum-Welch algorithm). However, it also severely limits the representational capacity of HMMs.
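To make concrete how a single multinomial hidden state carries the history of a time series, here is a minimal sketch of the textbook forward recursion for an ordinary (non-factorial) HMM. It is not the factorial model from the paper, and the two-state parameter values and the function name `forward_likelihood` are invented for illustration.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of an observation sequence under a discrete HMM.

    initial[i]       = P(state_0 = i)
    transition[i][j] = P(state_{t+1} = j | state_t = i)
    emission[i][k]   = P(observation = k | state = i)
    """
    n_states = len(initial)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            emission[j][obs] * sum(alpha[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return sum(alpha)


# Invented two-state, two-symbol model.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
emission = [[0.9, 0.1], [0.3, 0.7]]
likelihood = forward_likelihood(initial, transition, emission, [0, 1, 1])
```

Everything the model remembers about the past is summarized in the vector `alpha` over one state variable. Representing several independent underlying factors with a single variable would need one state per joint configuration, which is the capacity limit the abstract refers to.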

algorithm, markov model, state representation, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.07)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Wu, Lizhong, Moody, John E.

A Smoothing Regularizer for Recurrent Neural Networks

We derive a smoothing regularizer for recurrent network models by requiring robustness in prediction performance to perturbations of the training data. The regularizer can be viewed as a generalization of the first order Tikhonov stabilizer to dynamic models. The closed-form expression of the regularizer covers both time-lagged and simultaneous recurrent nets, with feedforward nets and one-layer linear nets as special cases. We have successfully tested this regularizer in a number of case studies and found that it performs better than standard quadratic weight decay.

1 Introduction

One technique for preventing a neural network from overfitting noisy data is to add a regularizer to the error function being minimized. Regularizers typically smooth the fit to noisy data. Well-established techniques include ridge regression, see (Hoerl & Kennard 1970), and more generally spline smoothing functions or Tikhonov stabilizers that penalize the mth-order squared derivatives of the function being fit, as in (Tikhonov & Arsenin 1977), (Eubank 1988), (Hastie & Tibshirani 1990) and (Wahba 1990). These methods have recently been extended to networks of radial basis functions (Girosi, Jones & Poggio 1995), and several heuristic approaches have been developed for sigmoidal neural networks, for example, quadratic weight decay (Plaut, Nowlan & Hinton 1986), weight elimination (Scalettar & Zee 1988), (Chauvin 1990), (Weigend, Rumelhart & Huberman 1990) and soft weight sharing (Nowlan & Hinton 1992).
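As a small illustration of adding a regularizer to the error function, the sketch below implements the standard quadratic weight decay baseline mentioned above, not the smoothing regularizer derived in the paper. The function names, the `strength` parameter, and the example numbers are assumptions chosen for illustration.

```python
def mean_squared_error(predictions, targets):
    """Average squared difference between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def weight_decay_penalty(weights, strength):
    """Quadratic weight decay: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)


def regularized_loss(predictions, targets, weights, strength=0.01):
    """Data-fit error plus the penalty that discourages large weights."""
    return mean_squared_error(predictions, targets) + weight_decay_penalty(weights, strength)


# Invented example: error term 0.5 plus penalty 0.1 * (1 + 4) = 0.5.
loss = regularized_loss([1.0, 2.0], [1.0, 3.0], weights=[1.0, -2.0], strength=0.1)
```

Minimizing this combined objective trades data fit against weight magnitude. The abstract's claim is that a regularizer derived from robustness to input perturbations performs better than this uniform penalty.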

neural network, regularizer, weight decay, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > New York (0.05)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(3 more...)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)