There is increasing interest in building systems that can automatically interpret hand-drawn sketches. However, many challenges remain in terms of recognition accuracy, robustness to different drawing styles, and ability to generalize across multiple domains. To address these challenges, we propose a new approach to sketched symbol recognition that focuses on the visual appearance of the symbols. This allows us to better handle the range of visual and stroke-level variations found in freehand drawings. We also present a new symbol classifier that is computationally efficient and invariant to rotation and local deformations. We show that our method exceeds state-of-the-art performance on all three domains we evaluated, including handwritten digits, PowerPoint shapes, and electrical circuit symbols.
In this paper, we target at the problem of sketch recognition. We systematically study how to incorporate users' correction and editing into isolated and full sketch recognition. This is a natural and necessary interaction in real systems such as Visio where very similar shapes exist. First, a novel algorithm is proposed to mine the prior shape knowledge for three editing modes. Second, to differentiate visually similar shapes, a novel symbol recognition algorithm is introduced by leveraging the learnt shape knowledge. Then, a novel editing detection algorithm is proposed to facilitate symbol recognition. Furthermore, both of the symbol recognizer and the editing detector are systematically incorporated into the full sketch recognition. Finally, based on the proposed algorithms, a real-time sketch recognition system is built to recognize hand-drawn flowcharts and diagrams with flexible interactions. Extensive experiments show the effectiveness of the proposed algorithms.
Objects in freely-drawn sketches often have no spatial or temporal separation, making object recognition difficult. We present a two-step stroke-grouping algorithm that first classifies individual strokes according to the type of object to which they belong, then groups strokes with like classifications into clusters representing individual objects. The first step facilitates clustering by naturally separating the strokes, and both steps fluidly integrate spatial and temporal information. Our approach to grouping is unique in its formulation as an efficient classification task rather than, for example, an expensive search task. Our single-stroke classifier performs at least as well as existing single-stroke classifiers on text vs. nontext classification, and we present the first three-way single-stroke classification results. Our stroke grouping results are the first reported of their kind; our grouping algorithm correctly groups between 86% and 91% of the ink in diagrams from two domains, with between 69% and 79% of shapes being perfectly clustered.
Most sketch recognition systems are accurate in recognizing either text or shape (graphic) ink strokes, but not both. Distinguishing between shape and text strokes is, therefore, a critical task in recognizing hand drawn digital ink diagrams which commonly contain many text labels and annotations. We have found the ‘entropy rate’ to be an accurate criterion of classification. We found that the entropy rate is significantly higher for text strokes compared to shape strokes and can serve as a distinguishing factor between the two. Using entropy values, our system produced a correct classification rate of 92.06% on test data belonging to diagrammatic domain for which the threshold was trained on. It also performed favorably on data for which no training examples at all were supplied.
In this work, we target at the problem of offline sketch parsing, in which the temporal orders of strokes are unavailable. It is more challenging than most of existing work, which usually leverages the temporal information to reduce the search space. Different from traditional approaches in which thousands of candidate groups are selected for recognition, we propose the idea of shapeness estimation to greatly reduce this number in a very fast way. Based on the observation that most of hand-drawn shapes with well-defined closed boundaries can be clearly differentiated from non-shapes if normalized into a very small size, we propose an efficient shapeness estimation method. A compact feature representation as well as its efficient extraction method is also proposed to speed up this process. Based on the proposed shapeness estimation, we present a three-stage cascade framework for offline sketch parsing. The shapeness estimation technique in this framework greatly reduces the number of false positives, resulting in a 96.2% detection rate with only 32 candidate group proposals, which is two orders of magnitude less than existing methods. Extensive experiments show the superiority of the proposed framework over state-of-the-art works on sketch parsing in both effectiveness and efficiency, even though they leveraged the temporal information of strokes.