AITopics

In this paper we propose recurrent neural networks with feedback into the input units for handling two types of data analysis problems. On the one hand, this scheme can be used for static data when some of the input variables are missing. On the other hand, it can also be used for sequential data, when some of the input variables are missing or are available at different frequencies.

experiment, input variable, recurrent network, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > Canada > Quebec > Montreal (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Adaptive Mixture of Probabilistic Transducers

Singer, Yoram

We introduce and analyze a mixture model for supervised learning of probabilistic transducers. We devise an online learning algorithm that efficiently infers the structure and estimates the parameters of each model in the mixture. Theoretical analysis and comparative simulations indicate that the learning algorithm tracks the best model from an arbitrarily large (possibly infinite) pool of models. We also present an application of the model for inducing a noun phrase recognizer.

prediction, probability, suffix tree transducer, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Konig, Yochai, Bourlard, Hervé, Morgan, Nelson

REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition

In this paper, we introduce REMAP, an approach for the training and estimation of posterior probabilities using a recursive algorithm that is reminiscent of the EMbased Forward-Backward (Liporace 1982) algorithm for the estimation of sequence likelihoods. Although very general, the method is developed in the context of a statistical model for transition-based speech recognition using Artificial Neural Networks (ANN) to generate probabilities for Hidden Markov Models (HMMs). In the new approach, we use local conditional posterior probabilities of transitions to estimate global posterior probabilities of word sequences. Although we still use ANNs to estimate posterior probabilities, the network is trained with targets that are themselves estimates of local posterior probabilities. An initial experimental result shows a significant decrease in error-rate in comparison to a baseline system. 1 INTRODUCTION The ultimate goal in speech recognition is to determine the sequence of words that has been uttered.

algorithm, posterior probability, probability, (11 more...)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)
North America > United States > Oregon (0.04)
Europe > Belgium (0.04)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Waterhouse, Steve R., MacKay, David, Robinson, Anthony J.

Bayesian Methods for Mixtures of Experts

ABSTRACT We present a Bayesian framework for inferring the parameters of a mixture of experts model based on ensemble learning by variational free energy minimisation. The Bayesian approach avoids the over-fitting and noise level underestimation problems of traditional maximum likelihood inference. We demonstrate these methods on artificial problems and sunspot time series prediction. INTRODUCTION The task of estimating the parameters of adaptive models such as artificial neural networks using Maximum Likelihood (ML) is well documented ego Geman, Bienenstock & Doursat (1992). ML estimates typically lead to models with high variance, a process known as "over-fitting".

algorithm, bayesian method, hyperparameter, (12 more...)

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Semenov, Serguei A., Shuvalova, Irina B.

Some results on convergent unlearning algorithm

In the past years the unsupervised learning schemes arose strong interest among researchers but for the time being a little is known about underlying learning mechanisms, as well as still less rigorous results like convergence theorems were obtained in this field. One of promising concepts along this line is so called "unlearning" for the Hopfield-type neural networks (Hopfield et ai, 1983, van Hemmen & Klemmer, 1992, Wimbauer et ai, 1994). Elaborating that elegant ideas the convergent unlearning algorithm has recently been proposed (Plakhov & Semenov, 1994), executing without patterns presentation. It is aimed at to correct initial Hebbian connectivity in order to provide extensive storage of arbitrary correlated data. This algorithm is stated as follows. Pick up at iteration step m, m 0,1,2,... a random network state s(m)

convergence theorem, plakhov & semenov, probability, (12 more...)

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > France (0.04)
Asia > Russia (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.55)

Marchand, Mario, Hadjifaradji, Saeed

Strong Unimodality and Exact Learning of Constant Depth µ-Perceptron Networks

strong unimodality and exact learning

Abstract Missing

Country:

Europe > United Kingdom (0.04)
Asia > Middle East > UAE (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.40)

Mansour, Yishay, Sahar, Sigal

Implementation Issues in the Fourier Transform Algorithm

Over the last few years the Fourier Transform (FT) representation of boolean functions has been an instrumental tool in the computational learning theory community. It has been used mainly to demonstrate the learnability of various classes of functions with respect to the uniform distribution.

algorithm, coefficient, hypothesis, (16 more...)

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.69)
Information Technology > Data Science > Data Quality > Data Transformation (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.55)

Kadirkamanathan, Visakan, Kadirkamanathan, Maha

Recursive Estimation of Dynamic Modular RBF Networks

In this paper, recursive estimation algorithms for dynamic modular networks are developed. The models are based on Gaussian RBF networks and the gating network is considered in two stages: At first, it is simply a time-varying scalar and in the second, it is based on the state, as in the mixture of local experts scheme. The resulting algorithm uses Kalman filter estimation for the model estimation and the gating probability estimation. Both, 'hard' and'soft' competition based estimation schemes are developed where in the former, the most probable network is adapted and in the latter all networks are adapted by appropriate weighting of the data. 1 INTRODUCTION The problem of learning multiple modes in a complex nonlinear system is increasingly being studied by various researchers [2, 3, 4, 5, 6], The use of a mixture of local experts [5, 6], and a conditional mixture density network [3] have been developed to model various modes of a system. The development has mainly been on model estimation from a given set of block data, with the model likelihood dependent on the input to the networks.

algorithm, estimation, probability, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)

A Realizable Learning Task which Exhibits Overfitting

Bös, Siegfried

In this paper we examine a perceptron learning task. The task is realizable since it is provided by another perceptron with identical architecture. Both perceptrons have nonlinear sigmoid output functions. The gain of the output function determines the level of nonlinearity of the learning task. It is observed that a high level of nonlinearity leads to overfitting. We give an explanation for this rather surprising observation and develop a method to avoid the overfitting. This method has two possible interpretations, one is learning with noise, the other cross-validated early stopping.

order parameter, perceptron, training process, (14 more...)

Country: Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.95)

Amari, Shun-ichi, Murata, Noboru, Müller, Klaus-Robert, Finke, Michael, Yang, Howard Hua

Statistical Theory of Overtraining - Is Cross-Validation Asymptotically Effective?

A statistical theory for overtraining is proposed. The analysis treats realizable stochastic neural networks, trained with Kullback Leibler loss in the asymptotic case. It is shown that the asymptotic gain in the generalization error is small if we perform early stopping, even if we have access to the optimal stopping time. Considering cross-validation stopping we answer the question: In what ratio the examples should be divided into training and testing sets in order to obtain the optimum performance. In the non-asymptotic region cross-validated early stopping always decreases the generalization error. Our large scale simulations done on a CM5 are in nice agreement with our analytical findings.

early stopping, generalization error, stopping, (17 more...)

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.05)
North America > United States > Illinois > Champaign County > Urbana (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.65)