Neural Information Processing Systems
On a Modification to the Mean Field EM Algorithm in Factorial Learning
Dunmur, A. P., Titterington, D. M.
A modification is described to the use of mean field approximations in the E step of EM algorithms for analysing data from latent structure models, as described by Ghahramani (1995), among others. The modification involves second-order Taylor approximations to expectations computed in the E step. The potential benefits of the method are illustrated using very simple latent profile models.

1 Introduction

Ghahramani (1995) advocated the use of mean field methods as a means to avoid the heavy computation involved in the E step of the EM algorithm used for estimating parameters within a certain latent structure model, and Ghahramani & Jordan (1995) used the same ideas in a more complex situation. Dunmur & Titterington (1996a) identified Ghahramani's model as a so-called latent profile model; they observed that Zhang (1992, 1993) had used mean field methods for a similar purpose, and they showed, in a simulation study based on very simple examples, that the mean field version of the EM algorithm often performed very respectably. By this it is meant that, when data were generated from the model under analysis, the estimators of the underlying parameters were efficient, judging by empirical results, especially in comparison with estimators obtained by employing the 'correct' EM algorithm: the examples therefore had to be simple enough that the correct EM algorithm is numerically feasible, although any success reported for the mean field version is, one hopes, an indication that the method will also be adequate in more complex situations in which the correct EM algorithm is not implementable because of computational complexity. In spite of the above positive remarks, there were circumstances in which there was a perceptible, if not dramatic, lack of efficiency in the simple (naive) mean field estimators, and the objective of this contribution is to propose and investigate ways of refining the method so as to improve performance without detracting from the appealing, and frequently essential, simplicity of the approach. The procedure used here is based on a second-order correction to the naive mean field, well known in statistical physics and sometimes called the cavity or TAP method (Mezard, Parisi & Virasoro, 1987). It has been applied recently in cluster analysis (Hofmann & Buhmann, 1996). In Section 2 we introduce the structure of our model, Section 3 explains the refined mean field approach, Section 4 provides numerical results, and Section 5 contains a statement of our conclusions.
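For orientation, the TAP correction is most familiar in the generic Ising-type setting with couplings J_{ij} and external fields h_i; the equations below are that textbook illustration (written here with assumed symbols), not the latent profile model analysed in the paper. The naive mean field fixed point and its second-order (Onsager reaction term) correction read:

    m_i = \tanh\Big( \beta h_i + \beta \sum_j J_{ij} m_j \Big)                                   (naive mean field)

    m_i = \tanh\Big( \beta h_i + \beta \sum_j J_{ij} m_j - \beta^2 m_i \sum_j J_{ij}^2 (1 - m_j^2) \Big)   (TAP / cavity)

The extra term subtracts the self-influence of unit i on its neighbours, which is the second-order effect the naive approximation ignores.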
Multi-Grid Methods for Reinforcement Learning in Controlled Diffusion Processes
The optimal control problem reduces to a boundary value problem for a fully nonlinear second-order elliptic differential equation of Hamilton-Jacobi-Bellman (HJB) type. Numerical analysis provides multigrid methods for this kind of equation. In the case of Learning Control, however, the systems of equations on the various grid levels are obtained using observed information (transitions and local cost). To ensure consistency, special attention needs to be directed toward the type of time and space discretization during the observation. An algorithm for multi-grid observation is proposed. The multi-grid algorithm is demonstrated on a simple queuing problem.

1 Introduction

Controlled Diffusion Processes (CDP) are the analogue of Markov Decision Problems in continuous state space and continuous time.
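As a point of reference, a generic form of the HJB boundary value problem for a controlled diffusion with drift b(x,a), diffusion coefficient \sigma(x,a), and running cost c(x,a) on a domain \Omega (notation assumed here, not taken from the paper) is:

    \min_{a \in A} \Big[ b(x,a) \cdot \nabla V(x) + \tfrac{1}{2} \operatorname{tr}\big( \sigma(x,a)\sigma(x,a)^{\top} \nabla^2 V(x) \big) + c(x,a) \Big] = 0, \quad x \in \Omega,

with the value function V prescribed on the boundary \partial\Omega. Discretizing this equation on a hierarchy of grids is what makes multigrid solvers applicable.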
Combinations of Weak Classifiers
To obtain classification systems with both good generalization performance and efficiency in space and time, we propose a learning method based on combinations of weak classifiers, where weak classifiers are linear classifiers (perceptrons) which can do a little better than making random guesses. A randomized algorithm is proposed to find the weak classifiers. They are then combined through a majority vote. As demonstrated through systematic experiments, the method developed is able to obtain combinations of weak classifiers with good generalization performance and a fast training time on a variety of test problems and real applications.
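A minimal sketch of this general recipe (randomly search for linear classifiers that beat chance, then take an unweighted majority vote); the sampling scheme, the `edge` threshold, and the committee size below are illustrative assumptions, not the paper's algorithm:

    import numpy as np

    def find_weak_classifier(X, y, n_trials=200, edge=0.05, seed=None):
        """Randomly sample linear classifiers (perceptron-style hyperplanes);
        stop at the first one that beats random guessing by at least `edge`,
        otherwise return the best one seen."""
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        best, best_acc = None, -1.0
        for _ in range(n_trials):
            w, b = rng.normal(size=d), rng.normal()
            acc = np.mean(np.sign(X @ w + b) == y)
            if acc > best_acc:
                best, best_acc = (w, b), acc
            if best_acc >= 0.5 + edge:      # "a little better than random"
                break
        return best

    def majority_vote(classifiers, X):
        """Combine weak linear classifiers by an unweighted majority vote."""
        votes = sum(np.sign(X @ w + b) for w, b in classifiers)
        return np.sign(votes)

    # Toy usage: labels in {-1, +1}, a committee of 11 weak classifiers.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))
    y = np.sign(X[:, 0] + 0.5 * X[:, 1])
    committee = [find_weak_classifier(X, y, seed=k) for k in range(11)]
    print("training accuracy:", np.mean(majority_vote(committee, X) == y))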
Improving the Accuracy and Speed of Support Vector Machines
Burges, Christopher J. C., Schölkopf, Bernhard
Support Vector Learning Machines (SVM) are finding application in pattern recognition, regression estimation, and operator inversion for ill-posed problems. Against this very general backdrop, any methods for improving the generalization performance, or for improving the speed in the test phase, of SVMs are of increasing interest. In this paper we combine two such techniques on a pattern recognition problem. The method for improving generalization performance (the "virtual support vector" method) does so by incorporating known invariances of the problem. This method achieves a drop in the error rate on 10,000 NIST test digit images from 1.4% to 1.0%.
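A rough sketch of the virtual support vector idea under stated assumptions (scikit-learn's SVC and one-pixel image shifts as the known invariance are illustrative choices here, not the authors' exact setup): train once, perturb only the support vectors with the invariance transformation, and retrain on the augmented set.

    import numpy as np
    from scipy.ndimage import shift
    from sklearn.svm import SVC

    def train_with_virtual_svs(X, y, image_shape=(28, 28)):
        """Virtual support vector sketch: train, translate the support vectors
        by one pixel in each direction, then retrain on the original data
        augmented with these 'virtual' examples."""
        base = SVC(kernel="rbf", C=10.0).fit(X, y)

        sv, sv_y = base.support_vectors_, y[base.support_]
        virtual_X, virtual_y = [], []
        for vec, label in zip(sv, sv_y):
            img = vec.reshape(image_shape)
            for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
                virtual_X.append(shift(img, (dx, dy), order=0, cval=0.0).ravel())
                virtual_y.append(label)

        X_aug = np.vstack([X, np.asarray(virtual_X)])
        y_aug = np.concatenate([y, np.asarray(virtual_y)])
        return SVC(kernel="rbf", C=10.0).fit(X_aug, y_aug)

Because only the support vectors are perturbed, the augmented training set stays much smaller than augmenting the full data set would make it.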
Spatiotemporal Coupling and Scaling of Natural Images and Human Visual Sensitivities
We study the spatiotemporal correlation in natural time-varying images and explore the hypothesis that the visual system is concerned with the optimal coding of visual representation through spatiotemporal decorrelation of the input signal. Based on the measured spatiotemporal power spectrum, the transform needed to decorrelate the input signal is derived analytically and then compared with the actual processing observed in psychophysical experiments.
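In the simplest (noise-free) reading of this decorrelation hypothesis, if S(f_x, f_t) denotes the measured spatiotemporal power spectrum, a filter K that whitens the input must satisfy (symbols assumed here for illustration):

    |K(f_x, f_t)|^2 \, S(f_x, f_t) = \text{const}, \qquad \text{i.e.} \qquad |K(f_x, f_t)| \propto S(f_x, f_t)^{-1/2},

with the response attenuated again at high spatiotemporal frequencies once measurement noise dominates the signal.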
Dynamics of Training
Bös, Siegfried, Opper, Manfred
A new method to calculate the full training process of a neural network is introduced. No sophisticated methods like the replica trick are used. The results are directly related to the actual number of training steps. Some results are presented here, such as the maximal learning rate, an exact description of early stopping, and the necessary number of training steps. Further problems can be addressed with this approach.
3D Object Recognition: A Model of View-Tuned Neurons
Bricolo, Emanuela, Poggio, Tomaso, Logothetis, Nikos K.
Recognition of specific objects, such as recognition of a particular face, can be based on representations that are object centered, such as 3D structural models. Alternatively, a 3D object may be represented for the purpose of recognition in terms of a set of views. This latter class of models is biologically attractive because model acquisition - the learning phase - is simpler and more natural. A simple model for this strategy of object recognition was proposed by Poggio and Edelman (Poggio and Edelman, 1990). They showed that, with few views of an object used as training examples, a classification network, such as a Gaussian radial basis function network, can learn to recognize novel views of that object.
Figure 1: (a) Schematic representation of the architecture of the Poggio-Edelman model. The shaded circles correspond to the view-tuned units, each tuned to a view of the object, while the open circle corresponds to the view-invariant, object-specific output unit. (b) [axis label: view angle]
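A minimal sketch in this spirit: Gaussian view-tuned units centred on the few training views, combined linearly into an object-specific output. The class name, the fixed width, and the least-squares fit below are illustrative assumptions, not the Poggio-Edelman implementation.

    import numpy as np

    class ViewTunedNetwork:
        """Gaussian RBF network: one 'view-tuned' unit per stored training view,
        combined linearly into a single view-invariant, object-specific output."""
        def __init__(self, sigma=1.0):
            self.sigma = sigma

        def fit(self, views, targets):
            # views: (n_views, d) feature vectors of the training views
            self.centers = np.asarray(views)
            K = self._activations(self.centers)
            # Least-squares weights from view-tuned units to the output unit.
            self.weights, *_ = np.linalg.lstsq(K, np.asarray(targets), rcond=None)
            return self

        def _activations(self, X):
            d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2.0 * self.sigma ** 2))

        def predict(self, X):
            return self._activations(np.asarray(X)) @ self.weights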
Interpolating Earth-science Data using RBF Networks and Mixtures of Experts
We present a mixture of experts (ME) approach to interpolate sparse, spatially correlated earth-science data. Kriging is an interpolation method that uses a global covariation model estimated from the data to account for its spatial dependence. Based on the close relationship between kriging and the radial basis function (RBF) network (Wan & Bone, 1996), we use a mixture of generalized RBF networks to partition the input space into statistically correlated regions and learn the local covariation model of the data in each region. Applying the ME approach to simulated and real-world data, we show that it achieves a good partitioning of the input space, learns the local covariation models, and improves generalization.
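A compact sketch of the flavour of this approach: a soft gate over region centres partitions 2-D locations, and one small RBF regressor is fitted per region with gate-weighted least squares. The gating form, fixed centres, and widths here are illustrative simplifications, not the authors' fitting procedure.

    import numpy as np

    def rbf_design(X, centers, width):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * width ** 2))

    class MixtureOfRBFExperts:
        """Gate sample locations to regional experts, fit one RBF regressor per
        region, and blend expert predictions by the gate weights."""
        def __init__(self, gate_centers, rbf_centers_per_expert, width=1.0, tau=1.0):
            self.gate_centers = np.asarray(gate_centers)   # (m, 2) region centres
            self.rbf_centers = rbf_centers_per_expert      # list of (k_i, 2) arrays
            self.width, self.tau = width, tau

        def _gate(self, X):
            d2 = ((X[:, None, :] - self.gate_centers[None, :, :]) ** 2).sum(-1)
            g = np.exp(-d2 / self.tau)
            return g / g.sum(axis=1, keepdims=True)        # soft responsibilities

        def fit(self, X, y):
            g = self._gate(X)
            self.coefs = []
            for i, centers in enumerate(self.rbf_centers):
                Phi = rbf_design(X, np.asarray(centers), self.width)
                w = np.sqrt(g[:, i])                       # gate-weighted least squares
                coef = np.linalg.lstsq(Phi * w[:, None], y * w, rcond=None)[0]
                self.coefs.append(coef)
            return self

        def predict(self, X):
            g = self._gate(X)
            preds = np.column_stack([
                rbf_design(X, np.asarray(c), self.width) @ coef
                for c, coef in zip(self.rbf_centers, self.coefs)])
            return (g * preds).sum(axis=1)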
Learning Exact Patterns of Quasi-synchronization among Spiking Neurons from Data on Multi-unit Recordings
Martignon, Laura, Laskey, Kathryn B., Deco, Gustavo, Vaadia, Eilon
This paper develops arguments for a family of temporal log-linear models to represent spatiotemporal correlations among the spiking events in a group of neurons. The models can represent not just pairwise correlations but also correlations of higher order. Methods are discussed for inferring the existence or absence of correlations and estimating their strength. A frequentist and a Bayesian approach to correlation detection are compared.
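The generic form of such a log-linear model for binary spike/no-spike indicators x_1, ..., x_n in a time bin (a standard parameterisation, written here for illustration) is:

    \log p(x_1, \ldots, x_n) = \theta_0 + \sum_i \theta_i x_i + \sum_{i<j} \theta_{ij} x_i x_j + \sum_{i<j<k} \theta_{ijk} x_i x_j x_k + \cdots,

where a nonzero higher-order coefficient \theta_{ij\ldots} indicates a genuine correlation of that order among the corresponding neurons, beyond what the lower-order terms explain.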