 Bayesian Learning


Probabilistic Anomaly Detection in Dynamic Systems

Neural Information Processing Systems

Padhraic Smyth, Jet Propulsion Laboratory 238-420, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109

Abstract: This paper describes probabilistic methods for novelty detection when using pattern recognition methods for fault monitoring of dynamic systems. The problem of novelty detection is particularly acute when prior knowledge and training data only allow one to construct an incomplete classification model. Allowance must be made in model design so that the classifier will be robust to data generated by classes not included in the training phase. For diagnosis applications, one practical approach is to construct both an input density model and a discriminative class model. Using Bayes' rule and prior estimates of the relative likelihood of data of known and unknown origin, the resulting classification equations are straightforward.
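As an illustration only, the combination of an input density model with a discriminative class model can be sketched with Bayes' rule as below. This is not taken from the paper: the function name `posteriors_with_unknown`, its argument names, and the numbers in the usage lines are hypothetical, and the density of data from unknown classes is simply assumed to be supplied by the caller.

```python
def posteriors_with_unknown(p_x_known, p_x_unknown, class_probs_given_known, prior_known):
    """Hypothetical sketch: class posteriors that reserve mass for an 'unknown' class.

    p_x_known               -- density of input x under the known-class density model
    p_x_unknown             -- assumed density of x for data of unknown origin
    class_probs_given_known -- discriminative model outputs P(class i | x, known)
    prior_known             -- prior probability that data comes from a known class
    """
    evidence = p_x_known * prior_known + p_x_unknown * (1.0 - prior_known)
    if evidence <= 0.0:
        raise ValueError("x has zero density under both models")
    # Bayes' rule: P(known | x)
    p_known = p_x_known * prior_known / evidence
    # Scale the discriminative outputs by the probability that x is of known origin.
    known_posteriors = [p * p_known for p in class_probs_given_known]
    return known_posteriors, 1.0 - p_known


# Illustrative numbers only: an input that the known-class density explains well.
known, unknown = posteriors_with_unknown(0.2, 0.05, [0.7, 0.3], 0.9)
print(known, unknown)
```

When the known-class density at x is small relative to the assumed unknown-class density, most of the posterior mass moves to the unknown class regardless of how confident the discriminative model is.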


Bayesian Self-Organization

Neural Information Processing Systems

Smirnakis, Lyman Laboratory of Physics, Harvard University, Cambridge, MA 02138; Lei Xu*, Dept. of Computer Science, HSH ENG BLDG, Room 1006, The Chinese University of Hong Kong, Shatin, NT, Hong Kong

*Lei Xu was a research scholar in the Division of Applied Sciences at Harvard University while this work was performed.

Abstract: Recent work by Becker and Hinton (Becker and Hinton, 1992) shows a promising mechanism, based on maximizing mutual information assuming spatial coherence, by which a system can self-organize to learn visual abilities such as binocular stereo. We introduce a more general criterion, based on Bayesian probability theory, and thereby demonstrate a connection to Bayesian theories of visual perception and to other organization principles for early vision (Atick and Redlich, 1990). Methods for implementation using variants of stochastic learning are described and, for the special case of linear filtering, we derive an analytic expression for the output.

1 Introduction: The input intensity patterns received by the human visual system are typically complicated functions of the object surfaces and light sources in the world. Thus the visual system must be able to extract information from the input intensities that is relatively independent of the actual intensity values.


Putting It All Together: Methods for Combining Neural Networks

Neural Information Processing Systems

In solving these tasks, one is faced with a large variety of learning algorithms and a vast selection of possible network architectures. After all the training, how does one know which is the best network? This decision is further complicated by the fact that standard techniques can be severely limited by problems such as over-fitting, data sparsity and local optima. The usual solution to these problems is a winner-take-all cross-validatory model selection. However, recent experimental and theoretical work indicates that we can improve performance by considering methods for combining neural networks.
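One simple way to combine networks, shown here only as a sketch and not as the method of the paper, is to average their outputs instead of keeping a single cross-validation winner. The helper names `mse` and `average_ensemble` and the prediction values are made up for illustration. For squared error, the averaged prediction is never worse than the mean error of the individual models, although it is not guaranteed to beat the single best one.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def average_ensemble(predictions):
    """Average the outputs of several models, one list of predictions per model."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


# Made-up targets and outputs from three hypothetical trained networks.
targets = [1.0, 2.0, 3.0]
predictions = [
    [1.2, 1.8, 3.3],
    [0.9, 2.3, 2.6],
    [1.1, 2.0, 3.2],
]

combined = average_ensemble(predictions)
individual_errors = [mse(p, targets) for p in predictions]
print("individual:", individual_errors)
print("combined:  ", mse(combined, targets))
```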


Bayesian Modeling and Classification of Neural Signals

Neural Information Processing Systems

Signal processing and classification algorithms often have limited applicability resulting from an inaccurate model of the signal's underlying structure. We present here an efficient, Bayesian algorithm for modeling a signal composed of the superposition of brief, Poisson-distributed functions. This methodology is applied to the specific problem of modeling and classifying extracellular neural waveforms which are composed of a superposition of an unknown number of action potentials (APs). Previous approaches have had limited success due largely to the problems of determining the spike shapes, deciding how many shapes are distinct, and decomposing overlapping APs. A Bayesian solution to each of these problems is obtained by inferring a probabilistic model of the waveform. This approach quantifies the uncertainty of the form and number of the inferred AP shapes and is used to obtain an efficient method for decomposing complex overlaps. This algorithm can extract many times more information than previous methods and facilitates the extracellular investigation of neuronal classes and of interactions within neuronal circuits.


Learning in Compositional Hierarchies: Inducing the Structure of Objects from Data

Neural Information Processing Systems

I propose an algorithm for learning hierarchical models for object recognition. The model architecture is a compositional hierarchy that represents part-whole relationships: parts are described in the local context of substructures of the object. The focus of this report is learning hierarchical models from data, i.e. inducing the structure of model prototypes from observed exemplars of an object. At each node in the hierarchy, a probability distribution governing its parameters must be learned. The connections between nodes reflect the structure of the object. The formulation of substructures is encouraged such that their parts become conditionally independent.


Bayesian Backprop in Action: Pruning, Committees, Error Bars and an Application to Spectroscopy

Neural Information Processing Systems

MacKay's Bayesian framework for backpropagation is conceptually appealing as well as practical. It automatically adjusts the weight decay parameters during training, and computes the evidence for each trained network. The evidence is proportional to our belief in the model. In this paper, the framework is extended to pruned nets, leading to an Ockham Factor for "tuning the architecture to the data". A committee of networks, selected by their high evidence, is a natural Bayesian construction.


Research Issues in Qualitative and Abstract Probability

AI Magazine

To assess the state of the art and identify issues requiring further investigation, a workshop on qualitative and abstract probability was held during the third week of November 1993. This workshop brought together a mix of active researchers from academia, industry, and government interested in the practical and theoretical impact of these abstractions on techniques, methods, and tools for solving complex AI tasks. The result was a set of specific recommendations on the most promising and important avenues for future research.


Operations for Learning with Graphical Models

Journal of Artificial Intelligence Research

This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided, including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. These include versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper concludes by sketching some implications for data analysis and summarizing how some popular algorithms fall within the framework presented. The main original contributions here are the decomposition techniques and the demonstration that graphical models provide a framework for understanding and developing complex learning algorithms.


Bayesian Learning via Stochastic Dynamics

Neural Information Processing Systems

The attempt to find a single "optimal" weight vector in conventional network training can lead to overfitting and poor generalization. Bayesian methods avoid this, without the need for a validation set, by averaging the outputs of many networks with weights sampled from the posterior distribution given the training data. This sample can be obtained by simulating a stochastic dynamical system that has the posterior as its stationary distribution.
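The idea of sampling weights from a stochastic dynamical system and averaging the resulting outputs can be illustrated with a deliberately small example. The sketch below substitutes plain (unadjusted) Langevin dynamics for whatever dynamics the paper uses, and a one-weight linear model for a network. The Gaussian noise and prior variances, the step size, the toy data, and all function names are assumptions made for illustration.

```python
import random


def grad_neg_log_posterior(w, xs, ys, noise_var=1.0, prior_var=100.0):
    """Gradient of the negative log posterior for y = w * x with Gaussian noise and prior."""
    grad_likelihood = sum((w * x - y) * x for x, y in zip(xs, ys)) / noise_var
    return grad_likelihood + w / prior_var


def langevin_samples(xs, ys, n_steps=5000, burn_in=500, eps=0.05, seed=0):
    """Simulate Langevin dynamics whose stationary distribution approximates the posterior."""
    rng = random.Random(seed)
    w = 0.0
    samples = []
    for step in range(n_steps):
        # Drift down the energy gradient, plus Gaussian noise scaled by the step size.
        w += -0.5 * eps * eps * grad_neg_log_posterior(w, xs, ys) + eps * rng.gauss(0.0, 1.0)
        if step >= burn_in:
            samples.append(w)
    return samples


def bayesian_prediction(samples, x_new):
    """Average the model output over the sampled weights."""
    return sum(w * x_new for w in samples) / len(samples)


# Toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
samples = langevin_samples(xs, ys)
print(bayesian_prediction(samples, 5.0))
```

Because the step size is finite and no accept/reject correction is applied, the samples only approximate the posterior; a smaller `eps` reduces that error at the cost of slower mixing.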


Transient Signal Detection with Neural Networks: The Search for the Desired Signal

Neural Information Processing Systems

Matched filtering has been one of the most powerful techniques employed for transient detection. Here we will show that a dynamic neural network outperforms the conventional approach. When the artificial neural network (ANN) is trained with supervised learning schemes, there is a need to supply the desired signal for all time, although we are only interested in detecting the transient. In this paper we also show the effects on the detection agreement of different strategies to construct the desired signal. The extension of the Bayes decision rule (0/1 desired signal), optimal in static classification, performs worse than desired signals constructed by random noise or prediction during the background.