AITopics

The human visual system can recognize various 3D (three-dimensional) objects from their 2D (two-dimensional) retinal images although the images vary significantly as the viewpoint changes. Recent computational models have explored how to learn to recognize 3D objects from their projected views (Poggio & Edelman, 1990). Most existing models are, however, based on supervised learning, i.e., during training the teacher tells which object each view belongs to. The model proposed by Weinshall et al. (1990) also requires a signal that segregates different objects during training. This paper, on the other hand, discusses unsupervised aspects of 3D object recognition where the system discovers categories by itself.

i-net, simulation, view direction, (14 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
Asia > Middle East > Jordan (0.05)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Back, Andrew D., Tsoi, Ah Chung

A Comparison of Discrete-Time Operator Models for Nonlinear System Identification

We present a unifying view of discrete-time operator models used in the context of finite word length linear signal processing. Comparisons are made between the recently presented gamma operator model, and the delta and rho operator models for performing nonlinear system identification and prediction using neural networks. A new model based on an adaptive bilinear transformation which generalizes all of the above models is presented.

discrete-time operator model, operator, pi operator, (15 more...)

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo

Connectionist Speaker Normalization with Generalized Resource Allocating Networks

The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speakerdependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.

mapping, recognition system, utterance, (11 more...)

Country:

Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Zhao, Ying, Schwartz, Richard M., Sroka, Jason J., Makhoul, John

Hierarchical Mixtures of Experts Methodology Applied to Continuous Speech Recognition

In this paper, we incorporate the Hierarchical Mixtures of Experts (HME) method of probability estimation, developed by Jordan [1], into an HMMbased continuous speech recognition system. The resulting system can be thought of as a continuous-density HMM system, but instead of using gaussian mixtures, the HME system employs a large set of hierarchically organized but relatively small neural networks to perform the probability density estimation. The hierarchical structure is reminiscent of a decision tree except for two important differences: each "expert" or neural net performs a "soft" decision rather than a hard decision, and, unlike ordinary decision trees, the parameters of all the neural nets in the HME are automatically trainable using the EM algorithm. We report results on the ARPA 5,OOO-word and 4O,OOO-word Wall Street Journal corpus using HME models. 1 Introduction Recent research has shown that a continuous-density HMM (CD-HMM) system can outperform a more constrained tied-mixture HMM system for large-vocabulary continuous speech recognition (CSR) when a large amount of training data is available [2]. In other work, the utility of decision trees has been demonstrated in classification problems by using the "divide and conquer" paradigm effectively, where a problem is divided into a hierarchical set of simpler problems.

decision tree, hierarchical mixture, hmm system, (10 more...)

Country:

Asia > Middle East > Jordan (0.25)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Industry: Government > Military (0.36)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Waterhouse, Steve R., Robinson, Anthony J.

Non-linear Prediction of Acoustic Vectors Using Hierarchical Mixtures of Experts

We are concerned in this paper with the application of multiple models, specifically the Hierarchical Mixtures of Experts, to time series prediction, specifically the problem of predicting acoustic vectors for use in speech coding. There have been a number of applications of multiple models in time series prediction. A classic example is the Threshold Autoregressive model (TAR) which was used by Tong & 836 S. R. Waterhouse, A. J. Robinson Lim (1980) to predict sunspot activity. More recently, Lewis, Kay and Stevens (in Weigend & Gershenfeld (1994)) describe the use of Multivariate and Regression Splines (MARS) to the prediction of future values of currency exchange rates. Finally, in speech prediction, Cuperman & Gersho (1985) describe the Switched Inter-frame Vector Prediction (SIVP) method which switches between separate linear predictors trained on different statistical classes of speech.

hierarchical mixture, prediction, variance, (13 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Asia > Middle East > Jordan (0.06)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

Lippe, D., Alspector, Joshua

A Study of Parallel Perturbative Gradient Descent

Motivated by difficulties in analog VLSI implementation of back-propagation [Rumelhart et al., 1986] and related algorithms that calculate gradients based on detailed knowledge of the neural network model, there were several similar recent papers proposing to use a parallel [Alspector et al., 1993, Cauwenberghs, 1993, Kirk et al., 1993] or a semi-parallel [Flower and Jabri, 1993] perturbative technique which has the property that it measures (with the physical neural network) rather than calculates the gradient. This technique is closely related to methods of stochastic approximation [Kushner and Clark, 1978] which have been investigated recently by workers in fields other than neural networks.

multiple perturbation, parallel perturbation, perturbation, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.43)

Kazlas, Peter T., Weigend, Andreas S.

Direct Multi-Step Time Series Prediction Using TD(λ)

Abstract Missing

direct multi-step time series prediction, lao

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Data Science > Data Mining (0.40)

Tresp, Volker, Neuneier, Ralph, Ahmad, Subutai

Efficient Methods for Dealing with Missing Data in Supervised Learning

In many applications it is important to know how to react if the available information is incomplete, if sensors fail or if sources of information become A.t the time of the research for this paper, a visiting researcher at the Center for Biological and Computational Learning, MIT.

ahmad and tresp, approximation, tresp, (15 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
Europe > Germany (0.05)
Asia > Middle East > Jordan (0.05)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sperduti, Alessandro, Stork, David G.

A Rapid Graph-based Method for Arbitrary Transformation-Invariant Pattern Classification

We present a graph-based method for rapid, accurate search through prototypes for transformation-invariant pattern classification. Our method has in theory the same recognition accuracy as other recent methods based on ''tangent distance" [Simard et al., 1994], since it uses the same categorization rule. Nevertheless ours is significantly faster during classification because far fewer tangent distances need be computed. Crucial to the success of our system are 1) a novel graph architecture in which transformation constraints and geometric relationships among prototypes are encoded during learning, and 2) an improved graph search criterion, used during classification. These architectural insights are applicable to a wide range of problem domains. Here we demonstrate that on a handwriting recognition task, a basic implementation of our system requires less than half the computation of the Euclidean sorting method. 1 INTRODUCTION In recent years, the crucial issue of incorporating invariances into networks for pattern recognition has received increased attention, most especially due to the work of 666 Alessandro Sperduti, David G. Stork

complexity, prototype, transformation, (14 more...)

Country:

North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Xu, Lei, Jordan, Michael I., Hinton, Geoffrey E.

An Alternative Model for Mixtures of Experts

We propose an alternative model for mixtures of experts which uses a different parametric form for the gating network. The modified model is trained by the EM algorithm. In comparison with earlier models-trained by either EM or gradient ascent-there is no need to select a learning stepsize. We report simulation experiments which show that the new architecture yields faster convergence. We also apply the new model to two problem domains: piecewise nonlinear function approximation and the combination of multiple previously trained classifiers. 1 INTRODUCTION For the mixtures of experts architecture (Jacobs, Jordan, Nowlan & Hinton, 1991), the EM algorithm decouples the learning process in a manner that fits well with the modular structure and yields a considerably improved rate of convergence (Jordan & Jacobs, 1994).

algorithm, expert net, xu and jordan, (15 more...)

Country:

Asia > Middle East > Jordan (0.53)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Mateo County > San Mateo (0.05)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)