AITopics

Country: Europe (0.28)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.32)

Hochreiter, Sepp, Schmidhuber, Jürgen

Source Separation as a By-Product of Regularization

This paper reveals a previously ignored connection between two important fields: regularization and independent component analysis (ICA).We show that at least one representative of a broad class of algorithms (regularizers that reduce network complexity) extracts independent features as a byproduct. This algorithm is Flat Minimum Search (FMS), a recent general method for finding low-complexity networks with high generalization capability. FMS works by minimizing both training error and required weight precision. Accordingto our theoretical analysis the hidden layer of an FMStrained autoassociator attempts at coding each input by a sparse code with as few simple features as possible. In experiments themethod extracts optimal codes for difficult versions of the "noisy bars" benchmark problem by separating the underlying sources, whereas ICA and PCA fail.

artificial intelligence, lococode, machine learning, (16 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning a Hierarchical Belief Network of Independent Factor Analyzers

Attias, Hagai

The model parameters are learned in an unsupervised manner by maximizing the likelihood that these data are generated by the model.

Denève, Sophie, Pouget, Alexandre, Latham, Peter E.

Divisive Normalization, Line Attractor Networks and Ideal Observers

Using simulations, we show that divisive normalization is a close approximation to a maximum likelihood estimator, which, in the context of population coding, is the same as an ideal observer. We also demonstrate analytically thatthis is a general property of a large class of nonlinear recurrent networks with line attractors. Our work suggests that divisive normalization plays a critical role in noise filtering, and that every cortical layer may be an ideal observer of the activity in the preceding layer. Information processing in the cortex is often formalized as a sequence of a linear stages followed by a nonlinearity. In the visual cortex, the nonlinearity is best described bysquaring combined with a divisive pooling of local activities.

artificial intelligence, machine learning, normalization, (13 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.38)

Magdon-Ismail, Malik, Atiya, Amir F.

Neural Networks for Density Estimation

Although quantities such as the mean, the variance, and possibly higher order moments of a random variable have often been sufficient to characterize a particular problem, the quest for higher modeling accuracy, and for more realistic assumptions drives us towards modeling the available random variables using their probability density. This of course leads us to the problem of density estimation (see [6]). The most common approach for density estimation is the nonparametric approach, where the density is determined according to a formula involving the data points available. The most common non parametric methods are the kernel density estimator, alsoknown as the Parzen window estimator [4] and the k-nearest neighbor technique [1]. Non parametric density estimation belongs to the class of ill-posed problems in the sense that small changes in the data can lead to large changes in "To whom correspondence should be addressed.

artificial intelligence, distribution function, machine learning, (15 more...)

Country: North America > United States > California (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)

Jr., Charles Lee Isbell, Viola, Paul A.

Restructuring Sparse High Dimensional Data for Effective Retrieval

The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector-a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique forextracting underlying structure from the document collection.

artificial intelligence, machine learning, natural language, (15 more...)

Country:

Africa (0.30)
North America > United States > Massachusetts (0.14)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Vasconcelos, Nuno, Lippman, Andrew

Learning Mixture Hierarchies

The hierarchical representation of data has various applications in domains suchas data mining, machine vision, or information retrieval. In this paper we introduce an extension of the Expectation-Maximization (EM) algorithm that learns mixture hierarchies in a computationally efficient manner.Efficiency is achieved by progressing in a bottom-up fashion, i.e. by clustering the mixture components of a given level in the hierarchy to obtain those of the level above. This clustering requires only knowledge of the mixture parameters, there being no need to resort to intermediate samples. In addition to practical applications, the algorithm allows a new interpretation of EM that makes clear the relationship with nonparametric kernel-based estimation methods, provides explicit control overthe tradeoff between the bias and variance of EM estimates, and offers new insights about the behavior ofdeterministic annealing methods commonly used with EM to escape local minima of the likelihood.

artificial intelligence, hierarchy, machine learning, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

Temporally Asymmetric Hebbian Learning, Spike liming and Neural Response Variability

Abbott, L. F., Song, Sen

Recent experimental data indicate that the strengthening or weakening of synaptic connections between neurons depends on the relative timing of pre-and postsynaptic action potentials. A Hebbian synaptic modification rule based on these data leads to a stable state in which the excitatory and inhibitory inputs to a neuron are balanced, producing an irregular pattern of firing. It has been proposed that neurons in vivo operate in such a mode.

artificial intelligence, firing mode, machine learning, (15 more...)

Country: North America > United States (0.47)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Ikeda, Shiro, Amari, Shun-ichi, Nakahara, Hiroyuki

Convergence of the Wake-Sleep Algorithm

The WS (Wake-Sleep) algorithm is a simple learning rule for the models with hidden variables. It is shown that this algorithm can be applied to a factor analysis model which is a linear version of the Helmholtz machine. Buteven for a factor analysis model, the general convergence is not proved theoretically. In this article, we describe the geometrical understanding ofthe WS algorithm in contrast with the EM (Expectation Maximization) algorithm and the em algorithm. As the result, we prove the convergence of the WS algorithm for the factor analysis model. We also show the condition for the convergence in general models.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: Asia > Japan (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Using Analytic QP and Sparseness to Speed Training of Support Vector Machines

Platt, John C.

SVMs have empirically been shown to give good generalization performance on a wide variety of problems. However, the use of SVMs is stilI limited to a small group of researchers. One possible reason is that training algorithms for SVMs are slow, especially for large problems. Another explanation is that SVM training algorithms are complex, subtle, and sometimes difficult to implement. This paper describes a new SVM learning algorithm that is easy to implement, often faster, and has better scaling properties than the standard SVM training algorithm. The new SVM learning algorithm is called Sequential Minimal Optimization (or SMO).

algorithm, artificial intelligence, machine learning, (15 more...)

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)