A Computational Model for Cursive Handwriting Based on the Minimization Principle

Neural Information Processing Systems

We propose a trajectory planning and control theory for continuous movements such as connected cursive handwriting and continuous natural speech. Its hardware is based on our previously proposed forward-inverse-relaxation neural network (Wada & Kawato, 1993). Computationally, its optimization principle is the minimum torque-change criterion. Regarding the representation level, hard constraints satisfied by a trajectory are represented as a set of via-points extracted from a handwritten character. Accordingly, we propose a via-point estimation algorithm that estimates via-points by repeating the trajectory formation of a character and the via-point extraction from the character. In experiments, good quantitative agreement is found between human handwriting data and the trajectories generated by the theory. Finally, we propose a recognition schema based on the movement generation. We show a result in which the recognition schema is applied to handwritten character recognition and can be extended to phoneme timing estimation of natural speech.

1 INTRODUCTION

In reaching movements, trajectory formation is an ill-posed problem because the hand can move along an infinite number of possible trajectories from the starting to the target point.
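
The minimum torque-change criterion named in the abstract is usually written in the motor-control literature as C = 1/2 * integral over the movement of sum_i (d tau_i / dt)^2 dt, where tau_i is the torque at joint i. The abstract itself does not state the formula, so the following is only a minimal sketch under that assumption: it evaluates a finite-difference approximation of the cost for sampled torque profiles. The function name `torque_change_cost` and the two example profiles are hypothetical and are not taken from the paper.

```python
import numpy as np

def torque_change_cost(torques, dt):
    """Discrete approximation of 1/2 * integral of sum_i (d tau_i / dt)^2 dt.

    torques: array of shape (T, n_joints), sampled every dt seconds.
    """
    torques = np.asarray(torques, dtype=float)
    # Finite-difference estimate of the torque rate, shape (T - 1, n_joints).
    torque_rate = np.diff(torques, axis=0) / dt
    return 0.5 * np.sum(torque_rate ** 2) * dt

# Hypothetical two-joint torque profiles over one second.
t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]
smooth = np.stack([t, 2.0 * t], axis=1)        # linear ramps
ripple = 0.05 * (-1.0) ** np.arange(t.size)    # alternating +/- 0.05 offset
jerky = smooth + ripple[:, None]               # same ramps with added ripple

print("smooth cost:", torque_change_cost(smooth, dt))
print("jerky cost: ", torque_change_cost(jerky, dt))
```

Under this criterion the rippled profile is penalized far more heavily than the smooth ramps, which is the sense in which the criterion selects one trajectory out of the infinitely many that satisfy the via-point constraints.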


Android phones can now read books, signs, business cards via Google's Mobile Vision

ZDNet

Google's Mobile Vision now gains the ability to read text. Google has introduced a new Text API for its Mobile Vision framework that allows Android developers to integrate optical character recognition (OCR) into their apps. The new Text API appears in the recently updated Google Play Services version 9.2, which restores Mobile Vision, Google's system that makes it easy for developers to add facial detection and barcode-reading functionality to Android apps. The Text OCR technology can currently recognize text in any Latin-based language, covering most European languages, including English, German, and French, as well as Turkish. Google has added Word Lens, a technology acquired last year, to its Google Translate app.


Euler Sparse Representation for Image Classification

AAAI Conferences

Sparse representation-based classification (SRC) has achieved great success in image recognition. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which may help improve the separability and margin between nearby data points, we propose Euler SRC for image classification, which is essentially SRC with Euler sparse representation. Specifically, it first maps the images into the complex space by Euler representation, which is largely insensitive to outliers and illumination, and then performs complex SRC with Euler representation. The major advantage of our method is that Euler representation is explicit and does not increase the dimensionality of the image space, thereby enabling this technique to be easily deployed in real applications. To solve Euler SRC, we present an efficient algorithm that is fast and converges well. Extensive experimental results illustrate that Euler SRC outperforms traditional SRC and achieves better performance for image classification.
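
The abstract does not give the mapping itself. A minimal sketch, assuming the form commonly used in the Euler representation literature, z = exp(i * alpha * pi * x) / sqrt(2) for pixel intensities x scaled to [0, 1], is shown here. The function name `euler_representation` and the value alpha = 1.9 are illustrative assumptions, not details confirmed by this abstract.

```python
import numpy as np

def euler_representation(x, alpha=1.9):
    """Map real intensities in [0, 1] to complex values of modulus 1/sqrt(2).

    The mapping is elementwise, so the output has the same shape as the input.
    """
    x = np.asarray(x, dtype=float)
    return np.exp(1j * alpha * np.pi * x) / np.sqrt(2.0)

# Hypothetical three-pixel "image" with intensities already scaled to [0, 1].
x = np.array([0.0, 0.25, 1.0])
z = euler_representation(x)

# Squared distance between two mapped pixels equals 1 - cos(alpha*pi*(xp - xq)),
# so it never exceeds 2 no matter how far apart the raw intensities are.
d2 = np.abs(z[0] - z[2]) ** 2
print(z.shape, np.abs(z), d2)
```

Two properties claimed in the abstract are visible in this sketch: the mapping is explicit and keeps the input dimensionality, and the bounded per-pixel distance limits how much a single outlying pixel can contribute.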


Google Opens Cloud Vision API Beta to Entire Developer Community

#artificialintelligence

Today, Google announced the beta release of its Google Cloud Vision API. The API was designed to empower applications to both see and understand images submitted to the API. With powerful features such as label/entity detection, optical character recognition, safe search detection, facial detection, landmark detection, and logo detection, the Cloud Vision API gives applications unprecedented ability to comprehend the situation within an image. With the new API, Google enters a rapidly developing market where both startups and major enterprises are producing cutting-edge technology. From Microsoft, with its Project Oxford, to niche startups like Cognitec and Lambda Labs, image analysis is proving to be an attractive space as it appeals across industries from marketing to security.
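
As a rough illustration of how an application might ask for several of these detection features at once, the sketch below builds a JSON request body in the shape publicly documented for the Cloud Vision v1 REST method `images:annotate`. The article does not describe the request format, so treat this shape as an assumption that may have changed since. The helper name `build_annotate_request` and the placeholder image bytes are hypothetical, and no network call is made.

```python
import base64
import json

def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Build a request body for one image and a list of detection features."""
    return {
        "requests": [
            {
                # Image content is sent as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }

# Placeholder bytes stand in for a real image file read from disk.
fake_image = b"not-a-real-image"
body = build_annotate_request(fake_image, ["LABEL_DETECTION", "TEXT_DETECTION"])
print(json.dumps(body, indent=2))
```

Each feature named in the article corresponds to one entry in the `features` list, so a single request can combine, for example, label detection and optical character recognition on the same image.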


A Bayesian Approach to Perceptual 3D Object-Part Decomposition Using Skeleton-Based Representations

AAAI Conferences

We present a probabilistic approach to shape decomposition that creates a skeleton-based shape representation of a 3D object while simultaneously decomposing it into constituent parts. Our approach probabilistically combines two prominent threads from the shape literature: skeleton-based (medial axis) representations of shape, and part-based representations of shape, in which shapes are combinations of primitive parts. Our approach recasts skeleton-based shape representation as a mixture estimation problem, allowing us to apply probabilistic estimation techniques to the problem of 3D shape decomposition, extending earlier work on the 2D case. The estimated 3D shape decompositions approximate human shape decomposition judgments. We present a tractable implementation of the framework, which begins by over-segmenting objects at concavities, and then probabilistically merges them to create a distribution over possible decompositions. This results in a hierarchy of decompositions at different structural scales, again closely matching known properties of human shape representation. The probabilistic estimation procedures that arise naturally in the model allow effective prediction of missing parts. We present results on shapes from a standard database illustrating the effectiveness of the approach.