AITopics

Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. Furthermore, we show how to choose the optimal sparse coding basis for denoising.
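To illustrate the shrinkage idea, here is a minimal sketch that assumes a Laplacian (double-exponential) prior on a single sparse component. Under that assumption, the MAP estimate of the component from an observation with additive Gaussian noise is a soft-threshold rule. The function name and the parameters `noise_var` and `scale` are hypothetical, and the abstract does not say which density or nonlinearity the paper actually uses.

```python
def soft_shrink(y, noise_var, scale):
    """MAP estimate of s from y = s + n (illustrative assumption, not the paper's rule).

    Assumes n ~ N(0, noise_var) and a Laplacian prior p(s) ~ exp(-|s| / scale).
    Minimizing (y - s)^2 / (2 * noise_var) + |s| / scale gives soft thresholding.
    """
    threshold = noise_var / scale
    # Shrink the magnitude toward zero; small observations are set to zero.
    magnitude = max(0.0, abs(y) - threshold)
    return magnitude if y >= 0 else -magnitude


# Apply the rule component-wise to noisy sparse coefficients.
noisy = [3.0, -0.2, 0.1, -4.0]
denoised = [soft_shrink(v, noise_var=1.0, scale=2.0) for v in noisy]
```

Small coefficients, which are most likely pure noise under a sparse prior, are zeroed, while large ones are kept with a reduced magnitude.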

artificial intelligence, machine learning, sparse, (15 more...)

Country: Europe > Finland (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hofmann, Thomas, Puzicha, Jan, Jordan, Michael I.

Learning from Dyadic Data

Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This type of data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework of learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures. We propose an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains. 1 Introduction Over the past decade learning from data has become a highly active field of research distributed over many disciplines like pattern recognition, neural computation, statistics, machine learning, and data mining.
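One common way to anneal EM is to temper the E-step: the class posteriors are raised to an inverse temperature `beta` and renormalized, with `beta` increased toward 1 over iterations. The sketch below shows only that tempered E-step for a single observation. It is an assumption about the general technique, not the paper's exact procedure, and the name `tempered_posteriors` is hypothetical.

```python
import math


def tempered_posteriors(log_joint, beta):
    """Tempered E-step for one observation (illustrative sketch).

    log_joint: list of log p(x, z=k), one entry per latent class k.
    beta: inverse temperature; beta = 1 recovers the standard EM posterior,
          beta = 0 gives a uniform distribution over classes.
    """
    scaled = [beta * v for v in log_joint]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(v - peak) for v in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Two latent classes with joint probabilities 0.3 and 0.1 for this observation.
log_joint = [math.log(0.3), math.log(0.1)]
smooth = tempered_posteriors(log_joint, beta=0.0)
standard = tempered_posteriors(log_joint, beta=1.0)
```

Starting with a small `beta` keeps the assignments soft, which can make the fit less sensitive to initialization.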

artificial intelligence, machine learning, natural language, (19 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Grandvalet, Yves, Canu, Stéphane

Outcomes of the Equivalence of Adaptive Ridge with Least Absolute Shrinkage

In supervised learning, we have a set of explicative variables x from which we wish to predict a response variable y.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Country:

North America > United States (0.15)
Europe > France (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Ghahramani, Zoubin, Roweis, Sam T.

Learning Nonlinear Dynamical Systems Using an EM Algorithm

The Expectation-Maximization (EM) algorithm is an iterative procedure for maximum likelihood parameter estimation from data sets with missing or hidden variables [2]. It has been applied to system identification in linear stochastic state-space models, where the state variables are hidden from the observer and both the state and the parameters of the model have to be estimated simultaneously [9]. We present a generalization of the EM algorithm for parameter estimation in nonlinear dynamical systems. The "expectation" step makes use of Extended Kalman Smoothing to estimate the state, while the "maximization" step re-estimates the parameters using these uncertain state estimates. In general, the nonlinear maximization step is difficult because it requires integrating out the uncertainty in the states. However, if Gaussian radial basis function (RBF) approximators are used to model the nonlinearities, the integrals become tractable and the maximization step can be solved via systems of linear equations.

algorithm, artificial intelligence, machine learning, (13 more...)

Country: North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Friedman, Nir, Singer, Yoram

Efficient Bayesian Parameter Estimation in Large Discrete Domains

Dirichlet distribution (see for instance [4]).

artificial intelligence, bayesian inference, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Boyen, Xavier, Koller, Daphne

Approximate Learning of Dynamic Models

Inference is a key component in learning probabilistic models from partially observable data. When learning temporal models, each of the many inference phases requires a traversal over an entire long data sequence; furthermore, the data structures manipulated are exponentially large, making this process computationally expensive. In [2], we describe an approximate inference algorithm for monitoring stochastic processes, and prove bounds on its approximation error. In this paper, we apply this algorithm as an approximate forward propagation step in an EM algorithm for learning temporal Bayesian networks. We provide a related approximation for the backward step, and prove error bounds for the combined algorithm.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Blake, Andrew, North, Ben, Isard, Michael

Learning Multi-Class Dynamics

Yule-Walker) are available for learning Auto-Regressive process models of simple, directly observable, dynamical processes. When sensor noise means that dynamics are observed only approximately, learning can still be achieved via Expectation-Maximisation (EM) together with Kalman Filtering. However, this does not handle more complex dynamics, involving multiple classes of motion.
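For the simple, directly observable case mentioned above, the Yule-Walker approach can be sketched for a first-order auto-regressive process, where the coefficient estimate reduces to the lag-1 autocovariance divided by the lag-0 autocovariance. This is a minimal illustration under that AR(1) assumption; the helper names `yule_walker_ar1` and `simulate_ar1` are hypothetical and do not come from the paper.

```python
import random


def yule_walker_ar1(x):
    """Estimate a in x[t] = a * x[t-1] + noise from a fully observed sequence."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    # Unnormalized sample autocovariances at lag 0 and lag 1.
    r0 = sum(v * v for v in centered)
    r1 = sum(centered[t] * centered[t + 1] for t in range(len(centered) - 1))
    return r1 / r0


def simulate_ar1(a, n, seed=0):
    """Generate n samples of an AR(1) process with unit-variance Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, 1.0))
    return x


estimate = yule_walker_ar1(simulate_ar1(a=0.8, n=5000))
```

With noisy observations this direct estimate becomes biased, which is the situation in which the abstract turns to EM with Kalman Filtering.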

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Bishop, Christopher M.

Bayesian PCA

The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this paper we use this probabilistic reformulation as the basis for a Bayesian treatment of PCA. Our key result is that the effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure. An important application of this framework is to mixtures of probabilistic PCA models, in which each component can determine its own effective complexity. 1 Introduction Principal component analysis (PCA) is a widely used technique for data analysis. Recently Tipping and Bishop (1997b) showed that a specific form of generative latent variable model has the property that its maximum likelihood solution extracts the principal subspace of the observed data set.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: Europe > United Kingdom > England (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Opper, Manfred, Winther, Ole

Mean Field Methods for Classification with Gaussian Processes

We discuss the application of TAP mean field methods known from the Statistical Mechanics of disordered systems to Bayesian classification models with Gaussian processes. In contrast to previous approaches, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given. They have been recently introduced into the Neural Computation community (Neal 1996, Williams & Rasmussen 1996, Mackay 1997). If we assume fields with zero prior mean, the statistics of h are entirely defined by the second order correlations C(s, s') = E[h(s)h(s')], where E denotes expectations with respect to the prior. Interesting examples are C(s, s') (1) C(s, s') (2) The choice (1) can be motivated as a limit of a two-layered neural network with infinitely many hidden units with factorizable input-hidden weight priors (Williams 1997).

artificial intelligence, classification, machine learning, (17 more...)

Country:

Europe > Denmark (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom (0.14)
Europe > Sweden (0.14)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Gat, Itay, Tishby, Naftali

Synergy and Redundancy among Brain Cells of Behaving Monkeys

While it is unlikely that complete information from any macroscopic neural tissue will ever be available, some interesting insight can be obtained from simultaneously recorded cells in the cortex of behaving animals. The question we address in this study is the level of synergy, or the level of cooperation, among brain cells, as determined by the information they provide about the observed behavior of the animal.

artificial intelligence, information, machine learning, (16 more...)

Country: Asia > Middle East > Israel (0.15)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)