A Computational Model for Cursive Handwriting Based on the Minimization Principle

Neural Information Processing Systems

We propose a trajectory planning and control theory for continuous movements such as connected cursive handwriting and continuous natural speech. Its hardware is based on our previously proposed forward-inverse-relaxation neural network (Wada & Kawato, 1993). Computationally, its optimization principle is the minimum torque-change criterion. At the representation level, the hard constraints satisfied by a trajectory are represented as a set of via-points extracted from a handwritten character. Accordingly, we propose a via-point estimation algorithm that estimates via-points by alternating between forming the trajectory of a character and extracting via-points from that character. In experiments, good quantitative agreement is found between human handwriting data and the trajectories generated by the theory. Finally, we propose a recognition schema based on movement generation. We show a result in which the recognition schema is applied to handwritten character recognition, and it can be extended to phoneme timing estimation of natural speech.

1 INTRODUCTION

In reaching movements, trajectory formation is an ill-posed problem because the hand can move along an infinite number of possible trajectories from the starting point to the target point.
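
The minimum torque-change criterion named in the abstract is one way to pick a single trajectory out of the infinitely many candidates. The abstract does not state the objective itself, so the form below is the one commonly given in the minimum torque-change literature, written with assumed symbols: \tau_i is the torque at joint i, n is the number of joints, and t_f is the movement duration. The via-points would enter as hard constraints that the trajectory must pass through.

```latex
C_T = \frac{1}{2} \int_{0}^{t_f} \sum_{i=1}^{n} \left( \frac{d\tau_i}{dt} \right)^{2} \, dt
```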


Android phones can now read books, signs, business cards via Google's Mobile Vision

ZDNet

Google's Mobile Vision has gained the ability to read text. Google has introduced a new Text API for its Mobile Vision framework that allows Android developers to integrate optical character recognition (OCR) into their apps. The new Text API appears in the recently updated Google Play Services version 9.2, which restores Mobile Vision, Google's system for making it easy for developers to add facial detection and barcode-reading functionality to Android apps. The Text OCR technology can currently recognize text in any Latin-based language, covering most European languages, including English, German, and French, as well as Turkish. Google has added Word Lens, a technology acquired last year, to its Google Translate app.


Google Opens Cloud Vision API Beta to Entire Developer Community

#artificialintelligence

Today, Google announced the beta release of its Google Cloud Vision API. The API was designed to let applications both see and understand the images submitted to it. With features such as label/entity detection, optical character recognition, safe search detection, facial detection, landmark detection, and logo detection, the Cloud Vision API gives applications an unprecedented ability to comprehend the situation within an image. With the new API, Google enters a rapidly developing market where both startups and major enterprises are producing cutting-edge technology. From Microsoft, with its Project Oxford, to niche startups like Cognitec and Lambda Labs, image analysis is proving to be an attractive space because it appeals across industries from marketing to security.


A polynomial-time relaxation of the Gromov-Hausdorff distance

arXiv.org Machine Learning

The Gromov-Hausdorff distance provides a metric on the set of isometry classes of compact metric spaces. Unfortunately, computing this metric directly is believed to be computationally intractable. Motivated by applications in shape matching and point-cloud comparison, we study a semidefinite programming relaxation of the Gromov-Hausdorff metric. This relaxation can be computed in polynomial time and, somewhat surprisingly, is itself a pseudometric. We describe the induced topology on the set of compact metric spaces. Finally, we demonstrate the numerical performance of various algorithms for computing the relaxed distance and apply these algorithms to several relevant data sets. In particular, we propose a greedy algorithm for finding the best correspondence between finite metric spaces that can handle hundreds of points.
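
To make the intractability concrete, here is a minimal sketch of the exact Gromov-Hausdorff distance between two tiny finite metric spaces, using the standard characterization as half the minimum distortion over all correspondences. It is not the paper's semidefinite relaxation or its greedy algorithm. The function names are hypothetical, and the enumeration is exponential in the number of point pairs, so it is only usable for a handful of points.

```python
def distortion(dx, dy, pairs):
    """Largest mismatch |d_X(x, x') - d_Y(y, y')| over matched pairs."""
    return max(abs(dx[i][k] - dy[j][l]) for (i, j) in pairs for (k, l) in pairs)


def gromov_hausdorff_bruteforce(dx, dy):
    """Exact GH distance between finite metric spaces given as distance matrices.

    A correspondence is a set of index pairs (i, j) that covers every point
    of X and every point of Y. All 2^(n*m) subsets are tried, which is why
    this only works for very small spaces.
    """
    n, m = len(dx), len(dy)
    cells = [(i, j) for i in range(n) for j in range(m)]
    all_x, all_y = set(range(n)), set(range(m))
    best = float("inf")
    for mask in range(1, 1 << len(cells)):
        pairs = [c for b, c in enumerate(cells) if (mask >> b) & 1]
        if {i for i, _ in pairs} != all_x:
            continue  # some point of X is unmatched
        if {j for _, j in pairs} != all_y:
            continue  # some point of Y is unmatched
        best = min(best, distortion(dx, dy, pairs))
    return best / 2


# A single point versus two points at distance 2: half the diameter.
one_point = [[0]]
two_points = [[0, 2], [2, 0]]
gh = gromov_hausdorff_bruteforce(one_point, two_points)
```

Even for two spaces of four points each, the loop visits 2^16 candidate subsets, which illustrates why a polynomial-time relaxation is attractive.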


Euler Sparse Representation for Image Classification

AAAI Conferences

Sparse representation-based classification (SRC) has achieved great success in image recognition. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which may help improve the separability and margin between nearby data points, we propose Euler SRC for image classification, which is essentially SRC with Euler sparse representation. Specifically, it first maps the images into the complex space by Euler representation, which is largely insensitive to outliers and illumination, and then performs complex SRC with Euler representation. The major advantage of our method is that the Euler representation is explicit and does not increase the dimensionality of the image space, which makes the technique easy to deploy in real applications. To solve Euler SRC, we present an efficient algorithm that is fast and converges well. Extensive experimental results show that Euler SRC outperforms traditional SRC on image classification.
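
The mapping step can be sketched as follows. The abstract does not give the formula, so this assumes the form commonly used for the Euler representation, z = exp(i * alpha * pi * x) / sqrt(2) for a pixel intensity x in [0, 1]. The function name and the default alpha = 1.9 are illustrative assumptions, not values taken from the abstract. The sparse coding step itself is not shown.

```python
import cmath
import math


def euler_representation(pixels, alpha=1.9):
    """Map intensities in [0, 1] to complex numbers of modulus 1/sqrt(2).

    Assumed form: z = exp(i * alpha * pi * x) / sqrt(2). The output has the
    same length as the input, so the dimensionality is unchanged.
    """
    return [cmath.exp(1j * alpha * math.pi * x) / math.sqrt(2) for x in pixels]


# Three pixels: black, mid-grey, white.
z = euler_representation([0.0, 0.5, 1.0])

# Squared distance between two mapped pixels is 1 - cos(alpha * pi * (x1 - x2)),
# which never exceeds 2 however far apart the raw intensities are.
sq_dist = abs(z[0] - z[2]) ** 2
```

Under this assumed form, the bounded distance gives one reading of why an outlier pixel has limited influence: its contribution to the distance between two images is capped.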