AITopics

Exact inference in densely connected Bayesian networks is computationally intractable, and so there is considerable interest in developing effective approximation schemes. One approach which has been adopted is to bound the log likelihood using a mean-field approximating distribution. While this leads to a tractable algorithm, the mean field distribution is assumed to be factorial and hence unimodal. In this paper we demonstrate the feasibility of using a richer class of approximating distributions based on mixtures of mean field distributions. We derive an efficient algorithm for updating the mixture parameters and apply it to the problem of learning in sigmoid belief networks. Our results demonstrate a systematic improvement over simple mean field theory as the number of mixture components is increased.

approximating posterior distribution, hlm, log likelihood, (11 more...)

Country:

Asia > Middle East > Jordan (0.07)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Sollich, Peter, Barber, David

On-line Learning from Finite Training Sets in Nonlinear Networks

Online learning is one of the most common forms of neural network training. We present an analysis of online learning from finite training sets for nonlinear networks (namely, soft-committee machines), advancing the theory to more realistic learning scenarios. Dynamical equations are derived for an appropriate set of order parameters; these are exact in the limiting case of either linear networks or infinite training sets. Preliminary comparisons with simulations suggest that the theory captures some effects of finite training sets, but may not yet account correctly for the presence of local minima.

equation, infinite training, order parameter, (14 more...)

Country: Europe > United Kingdom (0.04)

Genre: Instructional Material > Online (0.50)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.94)

Shawe-Taylor, John, Cristianini, Nello

Data-Dependent Structural Risk Minimization for Perceptron Decision Trees

This paper presents a neural-model of pre-attentive visual processing. The model explains why certain displays can be processed very fast, "in parallel", while others require slower, "serial" processing, in subsequent attentional systems. Our approach stems from the observation that the visual environment is overflowing with diverse information, but the biological information-processing systems analyzing it have a limited capacity [1]. This apparent mismatch suggests that data compression should be performed at an early stage of perception, and that via an accompanying process of dimension reduction, only a few essential features of the visual display should be retained. We propose that only parallel displays incorporate global features that enable fast target detection, and hence they can be processed pre-attentively, with all items (target and dis tractors) examined at once.

node, principal axis, similarity, (16 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
North America > United States > New York (0.04)
North America > United States > Indiana (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.43)

Rattray, Magnus, Saad, David

Globally Optimal On-line Learning Rules

We present a method for determining the globally optimal online learning rule for a soft committee machine under a statistical mechanics framework. This work complements previous results on locally optimal rules, where only the rate of change in generalization error was considered. We maximize the total reduction in generalization error over the whole learning process and show how the resulting rule can significantly outperform the locally optimal rule. 1 Introduction We consider a learning scenario in which a feed-forward neural network model (the student) emulates an unknown mapping (the teacher), given a set of training examples produced by the teacher. The performance of the student network is typically measured by its generalization error, which is the expected error on an unseen example. The aim of training is to reduce the generalization error by adapting the student network's parameters appropriately. A common form of training is online learning, where training patterns are presented sequentially and independently to the network at each learning step.

algorithm, generalization error, optimal rule, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Instructional Material > Online (0.40)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Vinje, William E., Gallant, Jack L.

Modeling Complex Cells in an Awake Macaque during Natural Image Viewing

Our model consists of a classical energy mechanism whose output is divided by nonclassical gain control and texture contrast mechanisms. We apply this model to review movies, a stimulus sequence that replicates the stimulation a cell receives during free viewing of natural images. Data were collected from three cells using five different review movies, and the model was fit separately to the data from each movie. For the energy mechanism alone we find modest but significant correlations (rE 0.41, 0.43, 0.59, 0.35) between model and data. These correlations are improved somewhat when we allow for suppressive surround effects (rE G 0.42, 0.56, 0.60, 0.37). In one case the inclusion of a delayed suppressive surround dramatically improves the fit to the data by modifying the time course of the model's response.

energy mechanism, mechanism, review movie, (11 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.15)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence (0.48)
Information Technology > Sensing and Signal Processing > Image Processing (0.35)

Vigário, Ricardo, Jousmäki, Veikko, Hämäläinen, Matti, Hari, Riitta, Oja, Erkki

Independent Component Analysis for Identification of Artifacts in Magnetoencephalographic Recordings

We have studied the application of an independent component analysis (ICA) approach to the identification and possible removal of artifacts from a magnetoencephalographic (MEG) recording.

artifact, component analysis, independent component, (14 more...)

Country: Europe > Finland > Uusimaa > Helsinki (0.05)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Mel, Bartlett W., Ruderman, Daniel L., Archie, Kevin A.

Toward a Single-Cell Account for Binocular Disparity Tuning: An Energy Model May Be Hiding in Your Dendrites

Further, the greater the similarity between objects, the stronger is the dependence on object appearance, and the more important twodimensional (2D) image information becomes. These findings, however, do not rule out the use of 3D structural information in recognition, and the degree to which 3D information is used in visual memory is an important issue. Liu, Knill, & Kersten (1995) showed that any model that is restricted to rotations in the image plane of independent 2D templates could not account for human performance in discriminating novel object views. We now present results from models of generalized radial basis functions (GRBF), 2D nearest neighbor matching that allows 2D affine transformations, and a Bayesian statistical estimator that integrates over all possible 2D affine transformations. The performance of the human observers relative to each of the models is better for the novel views than for the familiar template views, suggesting that humans generalize better to novel views from template views. The Bayesian estimator yields the optimal performance with 2D affine transformations and independent 2D templates. Therefore, models of 2D affine matching operations with independent 2D templates are unlikely to account for human recognition performance.

complex cell, novel view, observer, (15 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Dayan, Peter, Long, Theresa

Statistical Models of Conditioning

Conditioning experiments probe the ways that animals make predictions about rewards and punishments and use those predictions to control their behavior. One standard model of conditioning paradigms which involve many conditioned stimuli suggests that individual predictions should be added together. Various key results show that this model fails in some circumstances, and motivate an alternative model, in which there is attentional selection between different available stimuli. The new model is a form of mixture of experts, has a close relationship with some other existing psychological suggestions, and is statistically well-founded.

conditioning, inhibitory conditioning, prediction, (17 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.05)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Houde, John F., Jordan, Michael I.

Adaptation in Speech Motor Control

Human subjects are known to adapt their motor behavior to a shift of the visual field brought about by wearing prism glasses over their eyes. We have studied the analog of this effect in speech. U sing a device that can feed back transformed speech signals in real time, we exposed subjects to alterations of their own speech feedback. We found that speakers learn to adjust their production of a vowel to compensate for feedback alterations that change the vowel's perceived phonetic identity; moreover, the effect generalizes across consonant contexts and to different vowels. 1 INTRODUCTION For more than a century, it has been know that humans will adapt their reaches to altered visual feedback [8]. One of the most studied examples of this adaptation is prism adaptation, which is seen when a subject reaches to targets while wearing image-shifting prism glasses [2].

adaptation, experiment, generalization, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.06)
North America > United States > New York > Monroe County > Rochester (0.04)

Genre: Research Report > New Finding (0.70)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.42)
Information Technology > Artificial Intelligence > Speech (0.34)

Dailey, Matthew N., Cottrell, Garrison W.

Task and Spatial Frequency Effects on Face Specialization

There is strong evidence that face processing is localized in the brain. The double dissociation between prosopagnosia, a face recognition deficit occurring after brain damage, and visual object agnosia, difficulty recognizing otber kinds of complex objects, indicates tbat face and nonface object recognition may be served by partially independent mechanisms in the brain. Is neural specialization innate or learned? We suggest that this specialization could be tbe result of a competitive learning mechanism that, during development, devotes neural resources to the tasks they are best at performing. Furtber, we suggest that the specialization arises as an interaction between task requirements and developmental constraints. In this paper, we present a feed-forward computational model of visual processing, in which two modules compete to classify input stimuli. When one module receives low spatial frequency information and the other receives high spatial frequency information, and the task is to identify the faces while simply classifying the objects, the low frequency network shows a strong specialization for faces. No otber combination of tasks and inputs shows this strong specialization. We take these results as support for the idea that an innately-specified face processing module is unnecessary.

face recognition, information, specialization, (13 more...)