AITopics

I present a simple variation of importance sampling that explicitly searches forimportant regions in the target distribution. I prove that the technique yieldsunbiased estimates, and show empirically it can reduce the variance of standard Monte Carlo estimators. This is achieved by concentrating samplesin more significant regions of the sample space. 1 Introduction It is well known that general inference and learning with graphical models is computationally hard[1] and it is therefore necessary to consider restricted architectures [13], or approximate algorithms to perform these tasks [3, 7]. Among the most convenient and successful techniques are stochastic methods which are guaranteed to converge to a correct solution in the limit oflarge samples [10, 11, 12, 15]. These methods can be easily applied to complex inference problems that overwhelm deterministic approaches.

estimator, procedure, random variable, (16 more...)

Country:

North America > United States > New York (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

The Infinite Gaussian Mixture Model

Rasmussen, Carl Edward

In a Bayesian mixture model it is not necessary a priori to limit the number ofcomponents to be finite. In this paper an infinite Gaussian mixture model is presented which neatly sidesteps the difficult problem of finding the"right" number of mixture components. Inference in the model is done using an efficient parameter-free Markov Chain that relies entirely on Gibbs sampling.

indicator, mixture model, unrepresented class, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)
North America > United States > New York (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

A Multi-class Linear Learning Algorithm Related to Winnow

Mesterharm, Chris

Committee is an algorithm forcombining the predictions of a set of sub-experts in the online mistake-bounded model oflearning. A sub-expert is a special type of attribute that predicts with a distribution over a finite number of classes. Committee learns a linear function of sub-experts and uses this function to make class predictions. We provide bounds for Committee that show it performs well when the target can be represented by a few relevant sub-experts. We also show how Committee can be used to solve more traditional problems composed of attributes. This leads to a natural extension thatlearns on multi-class problems that contain both traditional attributes and sub-experts.

algorithm, transformation, winnow algorithm, (15 more...)

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Relaxed Online Maximum Margin Algorithm

Li, Yi, Long, Philip M.

We describe a new incremental algorithm for training linear threshold functions:the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctlywith the maximum margin. It is known that such a maximum-margin hypothesis can be computed by minimizing the length of the weight vector subject to a number of linear constraints. ROMMA works by maintaining a relatively simple relaxation of these constraints that can be efficiently updated. We prove a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm. Our analysis implies that the more computationally intensive maximum-margin algorithm alsosatisfies this mistake bound; this is the first worst-case performance guaranteefor this algorithm. We describe some experiments using ROMMA and a variant that updates its hypothesis more aggressively as batch algorithms to recognize handwritten digits. The computational complexity and simplicity of these algorithms is similar to that of perceptron algorithm,but their generalization is much better. We describe a sense in which the performance of ROMMA converges to that of SVM in the limit if bias isn't considered.

algorithm, perceptron algorithm, romma, (9 more...)

Country:

North America > United States > District of Columbia > Washington (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Nadeau, Claude, Bengio, Yoshua

Inference for the Generalization Error

In order to to compare learning algorithms, experimental results reported in the machine learning litterature often use statistical tests of significance. Unfortunately,most of these tests do not take into account the variability due to the choice of training set. We perform a theoretical investigation of the variance of the cross-validation estimate of the generalization errorthat takes into account the variability due to the choice of training sets. This allows us to propose two new ways to estimate this variance. We show, via simulations, that these new statistics perform well relative to the statistics considered by Dietterich (Dietterich, 1998). 1 Introduction When applying a learning algorithm (or comparing several algorithms), one is typically interested in estimating its generalization error. Its point estimation is rather trivial through cross-validation. Providing a variance estimate of that estimation, so that hypothesis testing and/orconfidence intervals are possible, is more difficult, especially, as pointed out in (Hinton et aI., 1995), if one wants to take into account the variability due to the choice of the training sets (Breiman, 1996). A notable effort in that direction is Dietterich's work (Dietterich, 1998).Careful investigation of the variance to be estimated allows us to provide new variance estimates, which tum out to perform well. Let us first layout the framework in which we shall work.

algorithm, dietterich, variance, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Quebec > Montreal (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Computation with Winner-Take-All as the Only Nonlinear Operation

Maass, Wolfgang

Everybody "knows" that neural networks need more than a single layer ofnonlinear units to compute interesting functions. We show that this is false if one employs winner-take-all as nonlinear unit: - Any boolean function can be computed by a single k-winner-takeall unitapplied to weighted sums of the input variables.

computational power, neural computation, threshold gate, (13 more...)

Country:

Europe > Austria > Styria > Graz (0.06)
North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Schneidman, Elad, Segev, Idan, Tishby, Naftali

Information Capacity and Robustness of Stochastic Neuron Models

The reliability and accuracy of spike trains have been shown to depend on the nature of the stimulus that the neuron encodes.

information, neuron, spike train, (15 more...)

Country: Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology: Information Technology > Artificial Intelligence (0.68)

Renart, Alfonso, Parga, Néstor, Rolls, Edmund T.

A Recurrent Model of the Interaction Between Prefrontal and Inferotemporal Cortex in Delay Tasks

A very simple model of two reciprocally connected attractor neural networks isstudied analytically in situations similar to those encountered in delay match-to-sample tasks with intervening stimuli and in tasks of memory guided attention. The model qualitatively reproduces many of the experimental data on these types of tasks and provides a framework for the understanding of the experimental observations in the context of the attractor neural network scenario.

module, neuron, stimulus, (16 more...)

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.45)

Rao, Rajesh P. N., Sejnowski, Terrence J.

Predictive Sequence Learning in Recurrent Neocortical Circuits

The neocortex is characterized by an extensive system of recurrent excitatory connections between neurons in a given area.

neuron, predictive sequence learning, synapse, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > San Diego County > La Jolla (0.05)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)

Eurich, Christian W., Wilke, Stefan D., Schwegler, Helmut

Neural Representation of Multi-Dimensional Stimuli

Spatial information comes in two forms: direct spatial information (for example, retinal position) and indirect temporal contiguity information, since objects encountered sequentially are in general spatially close.

information, layout, spatial information, (17 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > Bremen > Bremen (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)