Text Recognition
Text Recognition for Video in Microsoft Video Indexer
Video Indexer can recognize displayed text in videos. This blog post explains some of the techniques we used to extract the highest-quality data. To start, take a look at the sequence of frames below. Did you manage to recognize the text in the images? Most likely you did, without even noticing.
SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network
Liu, Zichuan (Nanyang Technological University) | Li, Yixing (Arizona State University) | Ren, Fengbo (Arizona State University) | Goh, Wang Ling (Nanyang Technological University) | Yu, Hao (Southern University of Science and Technology)
This paper proposes a new approach for real-time scene text recognition: a novel binary convolutional encoder-decoder network (B-CEDNet) combined with a bidirectional recurrent neural network (Bi-RNN). The B-CEDNet serves as a visual front-end that provides detailed character detection, while the back-end Bi-RNN performs character-level sequential correction and classification based on learned contextual knowledge. The front-end B-CEDNet can process multiple character-containing regions in a single forward pass, and is trained under binary constraints with significant compression, which leads to both a remarkable inference speedup and a reduction in memory usage. Thanks to the detailed character detection, the back-end Bi-RNN only needs to process a low-dimensional feature sequence carrying the category and spatial information of the extracted characters for sequence correction and classification. Trained on over 1,000,000 synthetic scene text images, the B-CEDNet achieves a recall of 0.86, a precision of 0.88, and an F-score of 0.87 on ICDAR-03 and ICDAR-13. With the correction and classification by the Bi-RNN, the proposed real-time scene text recognizer achieves state-of-the-art accuracy while consuming less than 1 ms of inference time. The processing flow is realized on a GPU with a small network size of 1.01 MB for the B-CEDNet and 3.23 MB for the Bi-RNN, making it much faster and smaller than existing solutions.
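The memory savings come from the binary constraint: each 32-bit weight is replaced by a 1-bit sign plus a single shared scaling factor per layer. The sketch below is not the paper's implementation; it is a minimal pure-Python illustration of XNOR-Net-style binarization, where a weight vector w is approximated by alpha * sign(w) with alpha set to the mean absolute weight.

```python
# Hedged sketch of binary weight quantization (XNOR-Net style),
# illustrating why binary-constrained networks shrink memory ~32x.
# This is NOT the B-CEDNet implementation from the paper.

def binarize(weights):
    """Return (alpha, signs): a shared scale and per-weight signs."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs

def binary_dot(alpha, signs, x):
    """Approximate dot(weights, x) using only signs and one scale."""
    return alpha * sum(s * xi for s, xi in zip(signs, x))

weights = [0.8, -0.5, 0.3, -0.9]   # toy full-precision weights
x = [1.0, 2.0, 3.0, 4.0]           # toy input activations

alpha, signs = binarize(weights)
exact = sum(w * xi for w, xi in zip(weights, x))
approx = binary_dot(alpha, signs, x)
# Each weight now costs 1 bit instead of 32, plus one shared float
# per layer; at inference the multiply reduces to sign flips.
```

At scale, the sign vector can be bit-packed so a layer of N float32 weights occupies roughly N/8 bytes instead of 4N, which is the kind of compression that lets the front-end fit in about 1 MB.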
Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data
Lou, Xinghua, Kansky, Ken, Lehrach, Wolfgang, Laan, CC, Marthi, Bhaskara, Phoenix, D., George, Dileep
Abstract: We demonstrate that a generative model for object shapes can achieve state of the art results on challenging scene text recognition tasks, and with orders of magnitude fewer training images than required for competing discriminative methods. In addition to transcribing text from challenging images, our method performs fine-grained instance segmentation of characters. We show that our model is more robust to both affine transformations and non-affine deformations compared to previous approaches.