Bischel, David Tyler (University of California, Riverside) | Stahovich, Thomas F. (University of California, Riverside) | Davis, Randall (Massachusetts Institute of Technology) | Adler, Aaron (Massachusetts Institute of Technology) | Peterson, Eric J. (University of California, Riverside)
Mechanical design tools would be considerably more useful if we could interact with them in the way that human designers communicate design ideas to one another, i.e., using crude sketches and informal speech. Those crude sketches frequently contain pen strokes of two different sorts, one type portraying device structure, the other denoting gestures, such as arrows used to indicate motion. We report here on techniques we developed that use information from both sketch and speech to distinguish gesture strokes from non-gestures -- a critical first step in understanding a sketch of a device. We collected and analyzed unconstrained device descriptions, which revealed six common types of gestures. Guided by this knowledge, we developed a classifier that uses both sketch and speech features to distinguish gesture strokes from non-gestures. Experiments with our techniques indicate that the sketch and speech modalities alone produce equivalent classification accuracy, but combining them produces higher accuracy.
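The abstract does not specify the features or classifier used; as a rough, hypothetical illustration of combining a sketch feature with a speech feature to separate gesture strokes from structure strokes, the sketch below scores a stroke's "curviness" together with the presence of a motion word spoken near the stroke. The feature definitions, word list, weights, and threshold are all invented for illustration, not taken from the paper.

```python
# Hypothetical multimodal gesture/non-gesture scorer (NOT the authors' feature
# set): one geometric feature from the sketch, one lexical cue from speech.

def curviness(points):
    """Ratio of path length to endpoint (chord) distance. Gesture strokes
    such as arrows often bend or double back, giving higher values."""
    path = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
    chord = ((points[-1][0] - points[0][0]) ** 2 +
             (points[-1][1] - points[0][1]) ** 2) ** 0.5
    return path / chord if chord else float("inf")

# Illustrative motion vocabulary; a real system would learn such cues.
MOTION_WORDS = {"moves", "rotates", "pushes", "spins", "slides"}

def speech_cue(transcript_words):
    """1.0 if a motion word was spoken near the stroke, else 0.0."""
    return 1.0 if MOTION_WORDS & set(transcript_words) else 0.0

def gesture_score(points, transcript_words, w_sketch=0.5, w_speech=0.8):
    # Weights are illustrative placeholders, not learned values.
    return w_sketch * (curviness(points) - 1.0) + w_speech * speech_cue(transcript_words)

def is_gesture(points, transcript_words, threshold=0.6):
    return gesture_score(points, transcript_words) > threshold
```

A bent stroke accompanied by "rotates" scores well above the threshold, while a straight stroke with structural speech ("the lever") scores near zero, mirroring the paper's finding that the two modalities reinforce each other.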
Current feature-based sketch recognition systems rely on human-selected features. Certain machine learning techniques have proven to be good nonlinear feature extractors. In this paper, we apply a manifold learning method, kernel Isomap, together with a new algorithm for multi-stroke sketch recognition, which significantly outperforms standard feature-based techniques.
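The kernel Isomap details are in the paper itself; as a minimal sketch of the geodesic-distance step at the heart of any Isomap variant, the code below builds a k-nearest-neighbor graph over sample points and replaces Euclidean distances with shortest-path (geodesic) distances via Floyd-Warshall. Everything here is a generic Isomap illustration, not the authors' algorithm.

```python
import math

def knn_graph(points, k):
    """Weighted k-NN graph as a dense distance matrix (inf = no edge)."""
    n = len(points)
    INF = float("inf")
    d = [[INF] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
        # k nearest neighbors of point i (excluding i itself).
        nbrs = sorted(range(n), key=lambda j: math.dist(points[i], points[j]))[1:k + 1]
        for j in nbrs:
            w = math.dist(points[i], points[j])
            d[i][j] = min(d[i][j], w)
            d[j][i] = min(d[j][i], w)  # keep the graph symmetric
    return d

def geodesic_distances(d):
    """Floyd-Warshall all-pairs shortest paths over the k-NN graph.
    O(n^3), fine for small illustrative inputs."""
    n = len(d)
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][m] + d[m][j] < d[i][j]:
                    d[i][j] = d[i][m] + d[m][j]
    return d
```

For points sampled along a curved manifold (e.g., a semicircle), the geodesic distance between the endpoints follows the curve and exceeds the straight-line chord, which is exactly the structure Isomap's subsequent embedding step preserves.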
Most sketch recognition systems are accurate in recognizing either text or shape (graphic) ink strokes, but not both. Distinguishing between shape and text strokes is therefore a critical task in recognizing hand-drawn digital ink diagrams, which commonly contain many text labels and annotations. We have found the 'entropy rate' to be an accurate classification criterion: it is significantly higher for text strokes than for shape strokes and can thus serve as a distinguishing factor between the two. Using entropy values, our system produced a correct classification rate of 92.06% on test data from the diagrammatic domain on which the threshold was trained. It also performed favorably on data for which no training examples at all were supplied.
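One plausible way to realize an entropy-rate criterion, offered here only as an illustrative sketch, is to quantize a stroke's segment directions into a small alphabet and compute the Shannon entropy of the resulting codes: wiggly text strokes visit many directions, while clean shape strokes visit few. The bin count and threshold below are assumptions, not the paper's trained values.

```python
import math
from collections import Counter

def direction_codes(points, bins=8):
    """Quantize each segment's direction into one of `bins` sectors."""
    codes = []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        ang = math.atan2(y2 - y1, x2 - x1) % (2 * math.pi)
        codes.append(int(ang / (2 * math.pi / bins)) % bins)
    return codes

def entropy(codes):
    """Shannon entropy (bits) of the direction-code distribution."""
    n = len(codes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(codes).values())

def classify(points, threshold=1.0):
    # Threshold is an illustrative placeholder; the paper trains it per domain.
    return "text" if entropy(direction_codes(points)) > threshold else "shape"
```

A straight line yields zero entropy (a single direction code) and classifies as shape, while a stroke that zigzags through several directions exceeds the threshold and classifies as text.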
Unlike English, where an unfamiliar word can be queried for its meaning by typing out its letters, the analogous operation in Chinese is far from trivial due to the nature of its written language. One approach to querying Chinese characters involves referencing their dictionary components, called radicals. This is advantageous because users need not know a character's pronunciation or stroke order, a requirement in other querying approaches. Currently, though, sketching a character's radical for querying is an unsupported capability in existing systems. Using the geometric-based LADDER sketching language combined with the Sezgin low-level recognizer, we constructed an application that first recognizes handwritten sketches of Chinese radicals and then outputs candidate Chinese characters containing that radical. Thus, we demonstrated that a geometric-based sketch recognition approach can be used to easily build applications for recognizing symbols related to Chinese characters while achieving reasonable recognition rates. Unlike current image-based recognition systems, our system also maintains the stroke-order information of characters. Since stroke order is important in written Chinese, our system can easily be extended for use in Chinese language education by providing students visual feedback on correct stroke order.