Learning Graphical Models
Model Based Image Compression and Adaptive Data Representation by Interacting Filter Banks
Okamoto, Toshiaki, Kawato, Mitsuo, Inui, Toshio, Miyake, Sei
To achieve high-rate image data compression while maintainig a high quality reconstructed image, a good image model and an efficient way to represent the specific data of each image must be introduced. Based on the physiological knowledge of multi - channel characteristics and inhibitory interactions between them in the human visual system, a mathematically coherent parallel architecture for image data compression which utilizes the Markov random field Image model and interactions between a vast number of filter banks, is proposed.
Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters
One of the attractions of neural network approaches to pattern recognition is the use of a discrimination-based training method. We show that once we have modified the output layer of a multilayer perceptron to provide mathematically correct probability distributions, and replaced the usual squared error criterion with a probability-based score, the result is equivalent to Maximum Mutual Information training, which has been used successfully to improve the performance of hidden Markov models for speech recognition. If the network is specially constructed to perform the recognition computations of a given kind of stochastic model based classifier then we obtain a method for discrimination-based training of the parameters of the models. Examples include an HMM-based word discriminator, which we call an'Alphanet'.
HMM Speech Recognition with Neural Net Discrimination
Huang, William Y., Lippmann, Richard P.
Two approaches were explored which integrate neural net classifiers with Hidden Markov Model (HMM) speech recognizers. Both attempt to improve speech pattern discrimination while retaining the temporal processing advantages of HMMs. One approach used neural nets to provide second-stage discrimination following an HMM recognizer. On a small vocabulary task, Radial Basis Function (RBF) and back-propagation neural nets reduced the error rate substantially (from 7.9% to 4.2% for the RBF classifier). In a larger vocabulary task, neural net classifiers did not reduce the error rate. They, however, outperformed Gaussian, Gaussian mixture, and k nearest neighbor (KNN) classifiers. In another approach, neural nets functioned as low-level acoustic-phonetic feature extractors. When classifying phonemes based on single 10 msec.
HMM Speech Recognition with Neural Net Discrimination
Huang, William Y., Lippmann, Richard P.
Two approaches were explored which integrate neural net classifiers with Hidden Markov Model (HMM) speech recognizers. Both attempt to improve speech pattern discrimination while retaining the temporal processing advantages of HMMs. One approach used neural nets to provide second-stage discrimination following an HMM recognizer. On a small vocabulary task, Radial Basis Function (RBF) and back-propagation neural nets reduced the error rate substantially (from 7.9% to 4.2% for the RBF classifier). In a larger vocabulary task, neural net classifiers did not reduce the error rate. They, however, outperformed Gaussian, Gaussian mixture, and k nearest neighbor (KNN) classifiers. In another approach, neural nets functioned as low-level acoustic-phonetic feature extractors. When classifying phonemes based on single 10 msec.
Coupled Markov Random Fields and Mean Field Theory
Geiger, Davi, Girosi, Federico
In recent years many researchers have investigated the use of Markov Random Fields (MRFs) for computer vision. They can be applied for example to reconstruct surfaces from sparse and noisy depth data coming from the output of a visual process, or to integrate early vision processes to label physical discontinuities. In this paper we show that by applying mean field theory to those MRFs models a class of neural networks is obtained. Those networks can speed up the solution for the MRFs models. The method is not restricted to computer vision. 1 Introduction
Maximum Likelihood Competitive Learning
One popular class of unsupervised algorithms are competitive algorithms. In the traditional view of competition, only one competitor, the winner, adapts for any given case. I propose to view competitive adaptation as attempting to fit a blend of simple probability generators (such as gaussians) to a set of data-points. The maximum likelihood fit of a model of this type suggests a "softer" form of competition, in which all competitors adapt in proportion to the relative probability that the input came from each competitor. I investigate one application of the soft competitive model, placement of radial basis function centers for function interpolation, and show that the soft model can give better performance with little additional computational cost. 1 INTRODUCTION Interest in unsupervised learning has increased recently due to the application of more sophisticated mathematical tools (Linsker, 1988; Plumbley and Fallside, 1988; Sanger, 1989) and the success of several elegant simulations of large scale selforganization (Linsker, 1986; Kohonen, 1982). One popular class of unsupervised algorithms are competitive algorithms, which have appeared as components in a variety of systems (Von der Malsburg, 1973; Fukushima, 1975; Grossberg, 1978). Generalizing the definition of Rumelhart and Zipser (1986), a competitive adaptive system consists of a collection of modules which are structurally identical except, possibly, for random initial parameter variation.
Bayesian Inference of Regular Grammar and Markov Source Models
Smith, Kurt R., Miller, Michael I.
In this paper we develop a Bayes criterion which includes the Rissanen complexity, for inferring regular grammar models. We develop two methods for regular grammar Bayesian inference. The fIrst method is based on treating the regular grammar as a I-dimensional Markov source, and the second is based on the combinatoric characteristics of the regular grammar itself. We apply the resulting Bayes criteria to a particular example in order to show the efficiency of each method.