Directed Networks
Sparse Code Shrinkage: Denoising by Nonlinear Maximum Likelihood Estimation
Hyvรคrinen, Aapo, Hoyer, Patrik O., Oja, Erkki
Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reductionand independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. Furthermore, we show how to choose the optimal sparse coding basis for denoising.
Learning from Dyadic Data
Hofmann, Thomas, Puzicha, Jan, Jordan, Michael I.
Dyadzc data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This type of data arises naturally in many application rangingfrom computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework of learning fromdyadic data by statistical mixture models. Our approach covers different models with fiat and hierarchical latent class structures. Wepropose an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains. 1 Introduction Over the past decade learning from data has become a highly active field of research distributedover many disciplines like pattern recognition, neural computation, statistics,machine learning, and data mining.
Approximate Learning of Dynamic Models
Inference is a key component in learning probabilistic models from partially observabledata. When learning temporal models, each of the many inference phases requires a traversal over an entire long data sequence; furthermore,the data structures manipulated are exponentially large, making this process computationally expensive. In [2], we describe an approximate inference algorithm for monitoring stochastic processes, and prove bounds on its approximation error. In this paper, we apply this algorithm as an approximate forward propagation step in an EM algorithm for learning temporal Bayesian networks. We provide a related approximation forthe backward step, and prove error bounds for the combined algorithm.
Bayesian PCA
The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this paper we use this probabilistic reformulation as the basis for a Bayesian treatment of PCA. Our key result is that effective dimensionalityof the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure. An important application of this framework is to mixtures of probabilistic PCA models, in which each component can determine its own effective complexity. 1 Introduction Principal component analysis (PCA) is a widely used technique for data analysis. Recently Tipping and Bishop (1997b) showed that a specific form of generative latent variable model has the property that its maximum likelihood solution extracts the principal subspace of the observed data set.
Mean Field Methods for Classification with Gaussian Processes
We discuss the application of TAP mean field methods known from the Statistical Mechanics of disordered systems to Bayesian classification modelswith Gaussian processes. In contrast to previous approaches, noknowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given. They have been recently introduced into the Neural Computation community (Neal 1996, Williams & Rasmussen 1996, Mackay 1997). If we assume fields with zero prior mean, the statistics of h is entirely defined by the second order correlations C(s, S') E[h(s)h(S')], where E denotes expectations 310 MOpper and 0. Winther with respect to the prior. Interesting examples are C(s, s') (1) C(s, s') (2) The choice (1) can be motivated as a limit of a two-layered neural network with infinitely many hidden units with factorizable input-hidden weight priors (Williams 1997).
Bayesian Modeling of Human Concept Learning
I consider the problem of learning concepts from small numbers of positive examples,a feat which humans perform routinely but which computers arerarely capable of. Bridging machine learning and cognitive science perspectives, I present both theoretical analysis and an empirical study with human subjects for the simple task oflearning concepts corresponding toaxis-aligned rectangles in a multidimensional feature space. Existing learning models, when applied to this task, cannot explain how subjects generalize from only a few examples of the concept. I propose a principled Bayesian model based on the assumption that the examples are a random sample from the concept to be learned. The model gives precise fits to human behavior on this simple task and provides qualitati ve insights into more complex, realistic cases of concept learning.
Inference in Bayesian Networks
A Bayesian network is a compact, expressive representation of uncertain relationships among parameters in a domain. In this article, I introduce basic methods for computing with Bayesian networks, starting with the simple idea of summing the probabilities of events of interest. The article introduces major current methods for exact computation, briefly surveys approximation methods, and closes with a brief discussion of open issues.
An Overview of Some Recent Developments in Bayesian Problem-Solving Techniques
The last few years have seen a surge in interest in the use of techniques from Bayesian decision theory to address problems in AI. Decision theory provides a normative framework for representing and reasoning about decision problems under uncertainty. Within the context of this framework, researchers in uncertainty in the AI community have been developing computational techniques for building rational agents and representations suited to engineering their knowledge bases. This special issue reviews recent research in Bayesian problem-solving techniques. The articles cover the topics of inference in Bayesian networks, decision-theoretic planning, and qualitative decision theory. Here, I provide a brief introduction to Bayesian networks and then cover applications of Bayesian problem-solving techniques, knowledge-based model construction and structured representations, and the learning of graphic probability models.