Bayesian Inference
Sparse Code Shrinkage: Denoising by Nonlinear Maximum Likelihood Estimation
Hyvärinen, Aapo, Hoyer, Patrik O., Oja, Erkki
Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. Furthermore, we show how to choose the optimal sparse coding basis for denoising.
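As a rough illustration of what a shrinkage nonlinearity can look like, the sketch below assumes each sparse component has a Laplace (double-exponential) prior with standard deviation d and is observed under additive Gaussian noise of variance noise_var. Under those assumptions the most probable clean value is a soft threshold of the noisy one. The function names, the Laplace prior, and the parameter names are illustrative assumptions and are not taken from the abstract; the paper's own nonlinearity may differ.

```python
import math


def laplace_shrink(u, noise_var, d):
    """Soft-threshold one noisy component u.

    Assumes (hypothetically) a Laplace prior with standard deviation d
    and Gaussian noise with variance noise_var.
    """
    threshold = math.sqrt(2.0) * noise_var / d
    # Pull the magnitude toward zero by the threshold, keep the sign.
    return math.copysign(max(0.0, abs(u) - threshold), u)


def shrink_components(components, noise_var, d):
    """Apply the shrinkage independently to every sparse component."""
    return [laplace_shrink(u, noise_var, d) for u in components]
```

In a full denoising pipeline one would first transform the noisy data into the sparse coding basis, shrink each component as above, and then transform back. Small components, which are mostly noise under the sparsity assumption, are set to zero, while large ones are only slightly reduced.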
Learning from Dyadic Data
Hofmann, Thomas, Puzicha, Jan, Jordan, Michael I.
Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This type of data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework of learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures. We propose an annealed version of the standard EM algorithm for model fitting which is empirically evaluated on a variety of data sets from different domains. 1 Introduction Over the past decade learning from data has become a highly active field of research distributed over many disciplines like pattern recognition, neural computation, statistics, machine learning, and data mining.
Bayesian PCA
The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this paper we use this probabilistic reformulation as the basis for a Bayesian treatment of PCA. Our key result is that the effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure. An important application of this framework is to mixtures of probabilistic PCA models, in which each component can determine its own effective complexity. 1 Introduction Principal component analysis (PCA) is a widely used technique for data analysis. Recently Tipping and Bishop (1997b) showed that a specific form of generative latent variable model has the property that its maximum likelihood solution extracts the principal subspace of the observed data set.
Mean Field Methods for Classification with Gaussian Processes
We discuss the application of TAP mean field methods known from the statistical mechanics of disordered systems to Bayesian classification models with Gaussian processes. In contrast to previous approaches, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given. Gaussian processes have recently been introduced into the neural computation community (Neal 1996, Williams & Rasmussen 1996, Mackay 1997). If we assume fields with zero prior mean, the statistics of h are entirely defined by the second-order correlations C(s, s') = E[h(s)h(s')], where E denotes expectations with respect to the prior. Interesting examples are C(s, s') (1) C(s, s') (2) The choice (1) can be motivated as a limit of a two-layered neural network with infinitely many hidden units with factorizable input-hidden weight priors (Williams 1997).
Bayesian Modeling of Human Concept Learning
I consider the problem of learning concepts from small numbers of positive examples, a feat which humans perform routinely but which computers are rarely capable of. Bridging machine learning and cognitive science perspectives, I present both theoretical analysis and an empirical study with human subjects for the simple task of learning concepts corresponding to axis-aligned rectangles in a multidimensional feature space. Existing learning models, when applied to this task, cannot explain how subjects generalize from only a few examples of the concept. I propose a principled Bayesian model based on the assumption that the examples are a random sample from the concept to be learned. The model gives precise fits to human behavior on this simple task and provides qualitative insights into more complex, realistic cases of concept learning.
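The random-sampling assumption can be made concrete with a toy sketch. If each positive example is drawn uniformly from the concept, a hypothesis of size |h| assigns probability (1/|h|)^n to n consistent examples, so smaller consistent hypotheses are favored. The code below restricts this to one dimension (integer intervals on a small grid, the 1-D analogue of an axis-aligned rectangle) with a uniform prior over intervals. The grid, the uniform prior, and the function name are assumptions made for illustration and are not the model's actual specification.

```python
def generalization_probability(examples, query, grid_max=10):
    """Posterior probability that `query` belongs to the concept.

    Hypotheses are all integer intervals [lo, hi] with 0 <= lo <= hi <= grid_max
    (a hypothetical hypothesis space with a uniform prior). Each example is
    assumed to be sampled uniformly from the true interval.
    """
    n = len(examples)
    total = 0.0  # posterior mass of all hypotheses consistent with the examples
    hit = 0.0    # portion of that mass whose interval also contains the query
    for lo in range(grid_max + 1):
        for hi in range(lo, grid_max + 1):
            if all(lo <= x <= hi for x in examples):
                size = hi - lo + 1
                weight = (1.0 / size) ** n  # likelihood under uniform sampling
                total += weight
                if lo <= query <= hi:
                    hit += weight
    return hit / total
```

For instance, comparing generalization_probability([4, 6], 8) with generalization_probability([4, 5, 6], 8) shows that an extra example inside the same range makes generalization beyond that range less likely, which is the kind of behavior a model based on random sampling predicts.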
Reports on the AAAI Fall Symposia
Giacomo, Giuseppe De, desJardins, Marie, Canamero, Dolores, Wasson, Glenn, Littman, Michael, Allwein, Gerard, Marriott, Kim, Meyer, Bernd, Webb, Barbara, Consi, Tom
The Association for the Advancement of Artificial Intelligence (AAAI) held its 1998 Fall Symposium Series on 23 to 25 October at the Omni Rosen Hotel in Orlando, Florida. This article contains summaries of seven of the symposia that were conducted: (1) Cognitive Robotics; (2) Distributed, Continual Planning; (3) Emotional and Intelligent: The Tangled Knot of Cognition; (4) Integrated Planning for Autonomous Agent Architectures; (5) Planning with Partially Observable Markov Decision Processes; (6) Reasoning with Visual and Diagrammatic Representations; and (7) Robotics and Biology: Developing Connections.
AI in Medicine: The Spectrum of Challenges from Managed Care to Molecular Medicine
AI has embraced medical applications from its inception, and some of the earliest work in successful application of AI technology occurred in medical contexts. Medicine in the twenty-first century will be very different than medicine in the late twentieth century. Fortunately, the technical challenges to AI that emerge are similar, and the prospects for success are high.
Inference in Bayesian Networks
A Bayesian network is a compact, expressive representation of uncertain relationships among parameters in a domain. In this article, I introduce basic methods for computing with Bayesian networks, starting with the simple idea of summing the probabilities of events of interest. The article introduces major current methods for exact computation, briefly surveys approximation methods, and closes with a brief discussion of open issues.
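The idea of summing the probabilities of events of interest can be sketched with a three-variable toy network. The structure (Rain influences Sprinkler, and both influence WetGrass) and every probability below are made-up values chosen for illustration; none of them come from the article. The posterior is obtained by summing the joint probability over all assignments that agree with the query and dividing by the sum over all assignments that agree with the evidence.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative numbers only).
P_RAIN = 0.2
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}
# Keyed by (rain, sprinkler): probability that the grass is wet.
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.8,
    (False, True): 0.9,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability of one full assignment, using the chain rule."""
    p_rain = P_RAIN if rain else 1.0 - P_RAIN
    p_s = P_SPRINKLER_GIVEN_RAIN[rain]
    p_sprinkler = p_s if sprinkler else 1.0 - p_s
    p_w = P_WET_GIVEN[(rain, sprinkler)]
    p_wet = p_w if wet else 1.0 - p_w
    return p_rain * p_sprinkler * p_wet


def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True) by brute-force summation."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence
```

Enumeration of this kind touches every joint assignment, so its cost grows exponentially with the number of variables, which is why more structured exact methods and approximations matter for larger networks.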