Learning Graphical Models
Hidden Markov Models for Human Genes
Baldi, Pierre, Brunak, Søren, Chauvin, Yves, Engelbrecht, Jacob, Krogh, Anders
We apply HMMs to the problem of modeling exons, intronsand detecting splice sites in the human genome. Our most interesting result so far is the detection of particular oscillatory patterns,with a minimal period ofroughly 10 nucleotides, that seem to be characteristic of exon regions and may have significant biological implications.
Bayesian Self-Organization
Yuille, Alan L., Smirnakis, Stelios M., Xu, Lei
Smirnakis Lyman Laboratory of Physics Harvard University Cambridge, MA 02138 Lei Xu * Dept. of Computer Science HSH ENG BLDG, Room 1006 The Chinese University of Hong Kong Shatin, NT Hong Kong Abstract Recent work by Becker and Hinton (Becker and Hinton, 1992) shows a promising mechanism, based on maximizing mutual information assumingspatial coherence, by which a system can selforganize itself to learn visual abilities such as binocular stereo. We introduce a more general criterion, based on Bayesian probability theory, and thereby demonstrate a connection to Bayesian theories ofvisual perception and to other organization principles for early vision (Atick and Redlich, 1990). Methods for implementation usingvariants of stochastic learning are described and, for the special case of linear filtering, we derive an analytic expression for the output. 1 Introduction The input intensity patterns received by the human visual system are typically complicated functions of the object surfaces and light sources in the world. It *Lei Xu was a research scholar in the Division of Applied Sciences at Harvard University while this work was performed. Thus the visual system must be able to extract information from the input intensities that is relatively independent of the actual intensity values.
Putting It All Together: Methods for Combining Neural Networks
In solving these tasks, one is faced with a large variety of learning algorithms and a vast selection of possible network architectures. After all the training, how does one know which is the best network? This decision is further complicated by the fact that standard techniques can be severely limited by problems such as over-fitting, data sparsity and local optima. The usual solution to these problems is a winner-take-all cross-validatory model selection. However, recent experimental and theoretical work indicates that we can improve performance by considering methods for combining neural networks.
Figure of Merit Training for Detection and Spotting
Chang, Eric I., Lippmann, Richard P.
Spotting tasks require detection of target patterns from a background of richly varied non-target inputs. The performance measure of interest for these tasks, called the figure of merit (FOM), is the detection rate for target patterns when the false alarm rate is in an acceptable range. A new approach to training spotters is presented which computes the FOM gradient for each input pattern and then directly maximizes the FOM using backpropagation. This eliminates the need for thresholds during training. It also uses network resources to model Bayesian a posteriori probability functions accurately only for patterns which have a significant effect on the detection accuracy over the false alarm rate of interest.
Resolving motion ambiguities
Diamantaras, K. I., Geiger, D.
We address the problem of optical flow reconstruction and in particular theproblem of resolving ambiguities near edges. They occur due to (i) the aperture problem and (ii) the occlusion problem, where pixels on both sides of an intensity edge are assigned the same velocity estimates (and confidence). However, these measurements are correct for just one side of the edge (the non occluded one). Our approach is to introduce an uncertamty field with respect to the estimates and confidence measures. We note that the confidence measuresare large at intensity edges and larger at the convex sides of the edges, i.e. inside corners, than at the concave side. We resolve the ambiguities through local interactions via coupled Markov random fields (MRF). The result is the detection of motion for regions of images with large global convexity.
Globally Trained Handwritten Word Recognizer using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models
Bengio, Yoshua, LeCun, Yann, Henderson, Donnie
We introduce a new approach for online recognition of handwritten wordswritten in unconstrained mixed style. The preprocessor performs a word-level normalization by fitting a model of the word structure using the EM algorithm. Words are then coded into low resolution "annotated images" where each pixel contains information abouttrajectory direction and curvature. The recognizer is a convolution network which can be spatially replicated. From the network output, a hidden Markov model produces word scores.
Mixtures of Controllers for Jump Linear and Non-Linear Plants
Cacciatore, Timothy W., Nowlan, Steven J.
To control such complex systems it is computationally moreefficient to decompose the problem into smaller subtasks, with different control strategies for different operating points. When detailed information about the plant is available, gain scheduling has proven a successful method for designing a global control (Shamma and Athans, 1992). The system is partitioned by choosing several operating points and a linear model for each operating point. A controller is designed for each linear model and a method for interpolating or'scheduling' the gains of the controllers is chosen. The control problem becomes even more challenging when the system to be controlled isnon-stationary, and the mode of the system is not explicitly observable.
Bayesian Modeling and Classification of Neural Signals
Signal processing and classification algorithms often have limited applicability resulting from an inaccurate model of the signal's underlying structure.We present here an efficient, Bayesian algorithm for modeling a signal composed of the superposition of brief, Poisson-distributed functions. This methodology is applied to the specific problem of modeling and classifying extracellular neural waveforms which are composed of a superposition of an unknown number of action potentials CAPs). Previous approaches have had limited success due largely to the problems of determining the spike shapes, deciding how many are shapes distinct, and decomposing overlapping APs. A Bayesian solution to each of these problems is obtained by inferring a probabilistic model of the waveform. This approach quantifies the uncertainty of the form and number of the inferred AP shapes and is used to obtain an efficient method for decomposing complex overlaps. This algorithm can extract many times more information than previous methods and facilitates the extracellular investigation of neuronal classes and of interactions within neuronal circuits.
Learning in Compositional Hierarchies: Inducing the Structure of Objects from Data
I propose a learning algorithm for learning hierarchical models for object recognition.The model architecture is a compositional hierarchy that represents part-whole relationships: parts are described in the local contextof substructures of the object. The focus of this report is learning hierarchical models from data, i.e. inducing the structure of model prototypes from observed exemplars of an object. At each node in the hierarchy, a probability distribution governing its parameters must be learned. The connections between nodes reflects the structure of the object. The formulation of substructures is encouraged such that their parts become conditionally independent.
Bayesian Backprop in Action: Pruning, Committees, Error Bars and an Application to Spectroscopy
MacKay's Bayesian framework for backpropagation is conceptually appealing as well as practical. It automatically adjusts the weight decay parameters during training, and computes the evidence for each trained network. The evidence is proportional to our belief in the model. In this paper, the framework is extended to pruned nets, leading to an Ockham Factor for "tuning the architecture to the data". A committee of networks, selected by their high evidence, is a natural Bayesian construction.