 Lee, Daniel D.


Multiplicative Updates for Classification by Mixture Models

Neural Information Processing Systems

We investigate a learning algorithm for the classification of nonnegative data by mixture models. Multiplicative update rules are derived that directly optimize the performance of these models as classifiers. The update rules have a simple closed form and an intuitive appeal. Our algorithm retains the main virtues of the Expectation-Maximization (EM) algorithm--its guarantee of monotonic improvement, and its absence of tuning parameters--with the added advantage of optimizing a discriminative objective function. The algorithm reduces as a special case to the method of generalized iterative scaling for log-linear models. The learning rate of the algorithm is controlled by the sparseness of the training data. We use the method of nonnegative matrix factorization (NMF) to discover sparse distributed representations of the data. This form of feature selection greatly accelerates learning and makes the algorithm practical on large problems. Experiments show that discriminatively trained mixture models lead to much better classification than comparably sized models trained by EM.
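The abstract notes that the update rules reduce, as a special case, to generalized iterative scaling (GIS) for log-linear models. As a rough illustration of that special case only, here is a minimal GIS sketch for a conditional log-linear classifier with nonnegative features; the function name, the slack-feature construction, and the toy setup are illustrative assumptions, not the paper's mixture-model updates.

```python
import numpy as np

def gis_loglinear(F, y, n_classes, n_iter=100):
    """Generalized iterative scaling for a conditional log-linear classifier.

    F : (n_samples, n_features) nonnegative feature matrix (hypothetical toy setup).
    y : (n_samples,) integer class labels in [0, n_classes).
    A slack feature pads every row so it sums to a constant C, which is what
    GIS requires for its closed-form multiplicative step.
    """
    n, d = F.shape
    C = F.sum(axis=1).max()
    F = np.hstack([F, C - F.sum(axis=1, keepdims=True)])  # slack feature
    d += 1
    lam = np.zeros((n_classes, d))                         # one weight vector per class

    # Empirical feature counts, accumulated per class.
    emp = np.zeros((n_classes, d))
    for c in range(n_classes):
        emp[c] = F[y == c].sum(axis=0)

    for _ in range(n_iter):
        # Model posteriors p(c | x) under the current weights.
        logits = F @ lam.T
        logits -= logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)

        # Expected feature counts under the model, per class.
        model = p.T @ F

        # GIS step: (1/C) * log(empirical / model) in the log domain,
        # i.e. a multiplicative update on the exponentiated weights.
        lam += np.log((emp + 1e-12) / (model + 1e-12)) / C

    return lam
```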


An Information Maximization Approach to Overcomplete and Recurrent Representations

Neural Information Processing Systems

The principle of maximizing mutual information is applied to learning overcomplete and recurrent representations. The underlying model consists of a network of input units driving a larger number of output units with recurrent interactions. In the limit of zero noise, the network is deterministic and the mutual information can be related to the entropy of the output units.
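As context for the zero-noise infomax objective described above, the sketch below implements only the classical square, noiseless entropy-maximization rule (the Bell-Sejnowski update in its natural-gradient form); the paper's overcomplete and recurrent extension is not reproduced here, and the function name and step size are illustrative assumptions.

```python
import numpy as np

def infomax_step(W, X, lr=0.01):
    """One natural-gradient infomax update for a square, noiseless network.

    W : (d, d) unmixing matrix, X : (d, n_samples) zero-mean data.
    Maximizes the entropy of logistic outputs y = sigmoid(W x); this is the
    square-network baseline only, not the overcomplete/recurrent formulation.
    """
    d, n = X.shape
    U = W @ X                       # net inputs
    Y = 1.0 / (1.0 + np.exp(-U))    # logistic outputs
    # Natural-gradient form of the entropy gradient: dW ∝ (I + (1 - 2Y) U^T / n) W
    dW = (np.eye(d) + (1.0 - 2.0 * Y) @ U.T / n) @ W
    return W + lr * dW
```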


Algorithms for Non-negative Matrix Factorization

Neural Information Processing Systems

Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. Two different multiplicative algorithms for NMF are analyzed. They differ only slightly in the multiplicative factor used in the update rules. One algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence. The monotonic convergence of both algorithms can be proven using an auxiliary function analogous to that used for proving convergence of the Expectation-Maximization algorithm. The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.
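For concreteness, a minimal sketch of the least-squares multiplicative updates analyzed in the paper is given below; the initialization, iteration count, and small epsilon for numerical safety are illustrative choices, and the KL-divergence variant follows the same ratio pattern.

```python
import numpy as np

def nmf_multiplicative(V, r, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF minimizing the least-squares error ||V - WH||^2.

    V : (m, n) nonnegative data matrix, r : number of basis vectors.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps

    for _ in range(n_iter):
        # Each factor is rescaled by the ratio of the positive and negative
        # parts of the gradient, which keeps it nonnegative and never
        # increases the objective.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)

    return W, H
```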



The Nonnegative Boltzmann Machine

Neural Information Processing Systems

The nonnegative Boltzmann machine (NNBM) is a recurrent neural network model that can describe multimodal nonnegative data. Application of maximum likelihood estimation to this model gives a learning rule that is analogous to the binary Boltzmann machine. We examine the utility of the mean field approximation for the NNBM, and describe how Monte Carlo sampling techniques can be used to learn its parameters. Reflective slice sampling is particularly well-suited for this distribution, and can be implemented efficiently to sample from it. We illustrate learning of the NNBM on a translationally invariant distribution, as well as on a generative model for images of human faces.

Introduction: The multivariate Gaussian is the most elementary distribution used to model generic data. It represents the maximum entropy distribution under the constraint that the mean and covariance matrix of the distribution match those of the data. For the case of binary data, the maximum entropy distribution that matches the first and second order statistics of the data is given by the Boltzmann machine [1].
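To make the model concrete, the sketch below assumes the quadratic NNBM energy E(v) = 0.5 v^T A v - b^T v restricted to the nonnegative orthant and samples it with a plain random-walk Metropolis step; this is a simpler stand-in for the reflective slice sampling described in the abstract, and the function names and parameter choices are illustrative.

```python
import numpy as np

def nnbm_energy(v, A, b):
    """Quadratic NNBM energy E(v) = 0.5 v^T A v - b^T v on the nonnegative orthant."""
    return 0.5 * v @ A @ v - b @ v

def nnbm_metropolis(A, b, n_samples=5000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for p(v) proportional to exp(-E(v)), v >= 0.

    Proposals with any negative component have zero density and are rejected;
    this is a stand-in for the reflective slice sampler, not its implementation.
    """
    rng = np.random.default_rng(seed)
    d = b.shape[0]
    v = np.abs(rng.random(d))            # start inside the orthant
    e = nnbm_energy(v, A, b)
    samples = np.empty((n_samples, d))

    for t in range(n_samples):
        prop = v + step * rng.standard_normal(d)
        if np.all(prop >= 0):            # outside the orthant -> reject
            e_prop = nnbm_energy(prop, A, b)
            if rng.random() < np.exp(min(0.0, e - e_prop)):
                v, e = prop, e_prop
        samples[t] = v

    return samples
```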


Algorithms for Independent Components Analysis and Higher Order Statistics

Neural Information Processing Systems

A latent variable generative model with finite noise is used to describe several different algorithms for Independent Components Analysis (ICA). In particular, the Fixed Point ICA algorithm is shown to be equivalent to the Expectation-Maximization algorithm for maximum likelihood under certain constraints, allowing the conditions for global convergence to be elucidated. The algorithms can also be explained by their generic behavior near a singular point where the size of the optimal generative bases vanishes. An expansion of the likelihood about this singular point indicates the role of higher order correlations in determining the features discovered by ICA. The application and convergence of these algorithms are demonstrated on a simple illustrative example.
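For reference, the sketch below gives the standard one-unit fixed-point ICA (FastICA) iteration on whitened data with a tanh nonlinearity, since that is the algorithm the abstract relates to constrained EM; it is not the paper's own derivation, and the setup is an illustrative assumption.

```python
import numpy as np

def fastica_one_unit(X, n_iter=100, tol=1e-8, seed=0):
    """One-unit fixed-point ICA iteration on whitened data (tanh nonlinearity).

    X : (d, n) whitened, zero-mean observations.
    Returns a unit-norm weight vector extracting one independent component.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)

    for _ in range(n_iter):
        wx = w @ X                                    # projections onto w
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        # Fixed-point update: w <- E[x g(w.x)] - E[g'(w.x)] w, then renormalize.
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        if np.abs(np.abs(w_new @ w) - 1.0) < tol:     # converged up to sign
            return w_new
        w = w_new

    return w
```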

