Rumelhart, David E.


A Self-Organizing Integrated Segmentation and Recognition Neural Net

Neural Information Processing Systems

Standard pattern recognition systems usually involve a segmentation step prior to the recognition step. For example, it is very common in character recognition to segment characters in a pre-processing step, then normalize the individual characters and pass them to a recognition engine such as a neural network, as in the work of LeCun et al. (1988) and Martin and Pittman (1988). This separation between segmentation and recognition becomes unreliable if the characters are touching each other, touching bounding boxes, broken, or noisy. Other applications such as scene analysis or continuous speech recognition pose similar and more severe segmentation problems. The difficulties encountered in these applications present an apparent dilemma: one cannot recognize the patterns until they are segmented, yet in many cases one cannot segment the patterns until they are recognized.
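
For context, this is the conventional pipeline the abstract argues against. The sketch below is a minimal, illustrative implementation of segment-then-recognize, not the paper's integrated network: the classify callable, the input size, and the threshold are hypothetical placeholders standing in for a trained recognizer and its preprocessing.

```python
import numpy as np
from scipy import ndimage

def segment_then_recognize(image, classify, size=(16, 16), threshold=0.5):
    """Segment a page image into connected components, normalize each
    component to a fixed size, and classify it in isolation."""
    binary = image > threshold
    labels, n = ndimage.label(binary)          # connected-component segmentation
    results = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)       # bounding box of component i
        patch = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        # crude nearest-neighbor resize to the recognizer's input size
        ry = np.linspace(0, patch.shape[0] - 1, size[0]).astype(int)
        rx = np.linspace(0, patch.shape[1] - 1, size[1]).astype(int)
        results.append(classify(patch[np.ix_(ry, rx)].astype(float)))
    return results
```

The failure modes named in the abstract map directly onto this sketch: touching characters merge into a single component, and broken characters split into several, so the recognizer never sees a clean, isolated pattern.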


Generalization by Weight-Elimination with Application to Forecasting

Neural Information Processing Systems

Inspired by the information theoretic idea of minimum description length, we add a term to the back-propagation cost function that penalizes network complexity. We give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about the prior distribution of the weights. We use this procedure to predict the sunspot time series and the notoriously noisy series of currency exchange rates.

1 INTRODUCTION

Learning procedures for connectionist networks are essentially statistical devices for performing inductive inference. There is a tradeoff between two goals: on the one hand, we want such devices to be as general as possible, so that they are able to learn a broad range of problems.
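
The complexity term in the paper has the closed form lambda * sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2), added to the usual sum-of-squares error. Below is a minimal NumPy sketch of that penalty and its gradient; the function names and the values of lambda and w0 are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def weight_elimination_penalty(weights, w0=1.0):
    # Penalty: sum_i (w_i/w0)^2 / (1 + (w_i/w0)^2).
    # For |w| << w0 it behaves like quadratic weight decay; for |w| >> w0
    # it saturates near 1, so each large weight costs about one "unit",
    # which pushes superfluous weights toward zero (elimination).
    scaled = (weights / w0) ** 2
    return np.sum(scaled / (1.0 + scaled))

def weight_elimination_grad(weights, w0=1.0):
    # Derivative of the penalty: (2 w_i / w0^2) / (1 + (w_i/w0)^2)^2,
    # added (times lam) to the back-propagation gradient of the error term.
    scaled = (weights / w0) ** 2
    return (2.0 * weights / w0 ** 2) / (1.0 + scaled) ** 2

def total_cost(error, weights, lam=1e-4, w0=1.0):
    # Total cost = fit error + lam * complexity. lam and w0 are placeholder
    # hyperparameters here; the paper discusses adjusting lam during training.
    return error + lam * weight_elimination_penalty(weights, w0)
```

The shape of the penalty is what distinguishes weight-elimination from plain weight decay: decay shrinks every weight in proportion to its size, whereas this term leaves large, useful weights essentially untouched while driving small ones to zero.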

