AITopics

In this paper, we discuss regularisation in online/sequential learning algorithms.In environments where data arrives sequentially, techniques such as cross-validation to achieve regularisation or model selection are not possible. Further, bootstrapping to determine aconfidence level is not practical. To surmount these problems, a minimum variance estimation approach that makes use of the extended Kalman algorithm for training multi-layer perceptrons isemployed. The novel contribution of this paper is to show the theoretical links between extended Kalman filtering, Sutton's variable learning rate algorithms and Mackay's Bayesian estimation framework.In doing so, we propose algorithms to overcome the need for heuristic choices of the initial conditions and noise covariance matrices in the Kalman approach.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Country:

Europe > United Kingdom > England (0.16)
Africa (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Bengio, Yoshua, Bengio, Samy, Isabelle, Jean-Franc, Singer, Yoram

Shared Context Probabilistic Transducers

Recently, a model for supervised learning of probabilistic transducers representedby suffix trees was introduced. However, this algorithm tendsto build very large trees, requiring very large amounts of computer memory. In this paper, we propose anew, more compact, transducermodel in which one shares the parameters of distributions associatedto contexts yielding similar conditional output distributions. We illustrate the advantages of the proposed algorithm withcomparative experiments on inducing a noun phrase recogmzer.

machine learning, natural language, node, (16 more...)

Country:

North America > United States (0.29)
North America > Canada > Quebec (0.15)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Barber, David, Schottky, Bernhard

Radial Basis Functions: A Bayesian Treatment

Bayesian methods have been successfully applied to regression and classification problems in multi-layer perceptrons. We present a novel application of Bayesian techniques to Radial Basis Function networks by developing a Gaussian approximation to the posterior distribution which, for fixed basis function widths, is analytic in the parameters. The setting of regularization constants by crossvalidation iswasteful as only a single optimal parameter estimate is retained. We treat this issue by assigning prior distributions to these constants, which are then adapted in light of the data under a simple re-estimation formula. 1 Introduction Radial Basis Function networks are popular regression and classification tools[lO]. For fixed basis function centers, RBFs are linear in their parameters and can therefore betrained with simple one shot linear algebra techniques[lO]. The use of unsupervised techniques to fix the basis function centers is, however, not generally optimal since setting the basis function centers using density estimation on the input data alone takes no account of the target values associated with that data. Ideally, therefore, we should include the target values in the training procedure[7, 3, 9]. Unfortunately, allowingcenters to adapt to the training targets leads to the RBF being a nonlinear function of its parameters, and training becomes more problematic. Most methods that perform supervised training of RBF parameters minimize the ·Present address: SNN, University of Nijmegen, Geert Grooteplein 21, Nijmegen, The Netherlands.

artificial intelligence, machine learning, regularization constant, (16 more...)

Country: Europe > Netherlands > Gelderland > Nijmegen (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Barber, David, Bishop, Christopher M.

Ensemble Learning for Multi-Layer Networks

In contrast to the maximum likelihood approach which finds only a single estimate for the regression parameters, the Bayesian approach yields a distribution of weight parameters, p(wID), conditional on the training data D, and predictions are ex- ·Present address: SNN, University of Nijmegen, Geert Grooteplein 21, Nijmegen, The Netherlands.

artificial intelligence, covariance matrix, machine learning, (15 more...)

Country: Europe > Netherlands > Gelderland > Nijmegen (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Kappen, Hilbert J., Ortiz, Francisco de Borja Rodríguez

Boltzmann Machine Learning Using Mean Field Theory and Linear Response Correction

We present a new approximate learning algorithm for Boltzmann Machines, using a systematic expansion of the Gibbs free energy to second order in the weights. The linear response correction to the correlations is given by the Hessian of the Gibbs free energy. The computational complexity of the algorithm is cubic in the number of neurons. We compare the performance of the exact BM learning algorithm with first order (Weiss) mean field theory and second order (TAP) mean field theory. The learning task consists of a fully connected Ising spin glass model on 10 neurons. We conclude that 1) the method works well for paramagnetic problems 2) the TAP correction gives a significant improvement over the Weiss mean field theory, both for paramagnetic and spin glass problems and 3) that the inclusion of diagonal weights improves the Weiss approximation for paramagnetic problems, but not for spin glass problems.

approximation, artificial intelligence, machine learning, (13 more...)

Country:

Europe > Spain (0.15)
Europe > Netherlands (0.15)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.64)

Sahani, Maneesh, Pezaris, John S., Andersen, Richard A.

On the Separation of Signals from Neighboring Cells in Tetrode Recordings

We discuss a solution to the problem of separating waveforms produced bymultiple cells in an extracellular neural recording. We take an explicitly probabilistic approach, using latent-variable models ofvarying sophistication to describe the distribution of waveforms producedby a single cell. The models range from a single Gaussian distribution of waveforms for each cell to a mixture of hidden Markov models. We stress the overall statistical structure of the approach, allowing the details of the generative model chosen to depend on the specific neural preparation.

artificial intelligence, machine learning, waveform, (18 more...)

Country: North America > United States > California (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Schenkel, Markus, Latimer, Cyril, Jabri, Marwan A.

Comparison of Human and Machine Word Recognition

We present a study which is concerned with word recognition rates for heavily degraded documents. We compare human with machine reading capabilitiesin a series of experiments, which explores the interaction of word/non-word recognition, word frequency and legality of non-words with degradation level. We also study the influence of character segmentation, andcompare human performance with that of our artificial neural network model for reading. We found that the proposed computer model uses word context as efficiently as humans, but performs slightly worse on the pure character recognition task. 1 Introduction Optical Character Recognition (OCR) of machine-print document images ·has matured considerably during the last decade. Recognition rates as high as 99.5% have been reported ongood quality documents. However, for lower image resolutions (200 Dpl and below), noisy images, images with blur or skew, the recognition rate declines considerably. Inbad quality documents, character segmentation is as big a problem as the actual character recognition.

artificial intelligence, error rate, machine learning, (16 more...)

Country: Oceania > Australia (0.15)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Review of Artificial Intelligence and Mobile Robotics

Brown, Chris

AI MagazineDec-15-1998

Three main sections follow: mainly on Rodney Brooks's subsumption

artificial intelligence, machine learning, robot, (14 more...)

AI Magazine

Country:

North America > United States > Massachusetts (0.29)
North America > United States > California (0.29)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

The DARPA High-Performance Knowledge Bases Project

Cohen, Paul R., Schrag, Robert, Jones, Eric, Pease, Adam, Lin, Albert, Starr, Barbara, Gunning, David, Burke, Murray

AI MagazineDec-15-1998

Now completing its first year, the High-Performance Knowledge Bases Project promotes technology for developing very large, flexible, and reusable knowledge bases. The project is supported by the Defense Advanced Research Projects Agency and includes more than 15 contractors in universities, research laboratories, and companies. The evaluation of the constituent technologies centers on two challenge problems, in crisis management and battlespace reasoning, each demanding powerful problem solving with very large knowledge bases. This article discusses the challenge problems, the constituent technologies, and their integration and evaluation.

artificial intelligence, knowledge management, machine learning, (21 more...)

AI Magazine

Country: North America > United States > California (1.00)

Genre: Research Report (0.87)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)