Goto

Collaborating Authors

Guyon, I.


Recognition-based Segmentation of On-Line Hand-printed Words

Neural Information Processing Systems

The input strings consist of a timeordered sequenceof XY coordinates, punctuated by pen-lifts. The methods were designed to work in "run-on mode" where there is no constraint on the spacing between characters. While both methods use a neural network recognition engine and a graph-algorithmic post-processor, their approaches to segmentation are quite different. Thefirst method, which we call IN SEC (for input segmentation), usesa combination of heuristics to identify particular penlifts as tentative segmentation points. The second method, which we call OUTSEC (for output segmentation), relies on the empirically trainedrecognition engine for both recognizing characters and identifying relevant segmentation points. 1 INTRODUCTION We address the problem of writer independent recognition of hand-printed words from an 80,OOO-word English dictionary. Several levels of difficulty in the recognition of hand-printed words are illustrated in figure 1. The examples were extracted from our databases (table 1). Except in the cases of boxed or clearly spaced characters, segmenting characters independently of the recognition process yields poor recognition performance.This has motivated us to explore recognition-based segmentation techniques.



Recognition-based Segmentation of On-Line Hand-printed Words

Neural Information Processing Systems

The input strings consist of a timeordered sequence of XY coordinates, punctuated by pen-lifts. The methods were designed to work in "run-on mode" where there is no constraint on the spacing between characters. While both methods use a neural network recognition engine and a graph-algorithmic post-processor, their approaches to segmentation are quite different. The first method, which we call IN SEC (for input segmentation), uses a combination of heuristics to identify particular penlifts as tentative segmentation points. The second method, which we call OUTSEC (for output segmentation), relies on the empirically trained recognition engine for both recognizing characters and identifying relevant segmentation points.




Recognition-based Segmentation of On-Line Hand-printed Words

Neural Information Processing Systems

The input strings consist of a timeordered sequence of XY coordinates, punctuated by pen-lifts. The methods were designed to work in "run-on mode" where there is no constraint on the spacing between characters. While both methods use a neural network recognition engine and a graph-algorithmic post-processor, their approaches to segmentation are quite different. The first method, which we call IN SEC (for input segmentation), uses a combination of heuristics to identify particular penlifts as tentative segmentation points. The second method, which we call OUTSEC (for output segmentation), relies on the empirically trained recognition engine for both recognizing characters and identifying relevant segmentation points.


Structural Risk Minimization for Character Recognition

Neural Information Processing Systems

The method of Structural Risk Minimization refers to tuning the capacity of the classifier to the available amount of training data. This capacity is influenced by several factors, including: (1) properties of the input space, (2) nature and structure of the classifier, and (3) learning algorithm. Actions based on these three factors are combined here to control the capacity of linear classifiers and improve generalization on the problem of handwritten digit recognition.


Structural Risk Minimization for Character Recognition

Neural Information Processing Systems

The method of Structural Risk Minimization refers to tuning the capacity of the classifier to the available amount of training data. This capacity is influenced by several factors, including: (1) properties of the input space, (2) nature and structure of the classifier, and (3) learning algorithm. Actions based on these three factors are combined here to control the capacity of linear classifiers and improve generalization on the problem of handwritten digit recognition.