Text Recognition
Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data
Lou, Xinghua, Kansky, Ken, Lehrach, Wolfgang, Laan, CC, Marthi, Bhaskara, Phoenix, D., George, Dileep
We demonstrate that a generative model for object shapes can achieve state of the art results on challenging scene text recognition tasks, and with orders of magnitude fewer training images than required for competing discriminative methods. In addition to transcribing text from challenging images, our method performs fine-grained instance segmentation of characters. We show that our model is more robust to both affine transformations and non-affine deformations compared to previous approaches. Papers published at the Neural Information Processing Systems Conference.
Build a Handwritten Text Recognition System using TensorFlow
Offline Handwritten Text Recognition (HTR) systems transcribe text contained in scanned images into digital text; an example is shown in Figure 1. We will build a Neural Network (NN) which is trained on word-images from the IAM dataset. As the input layer (and therefore also all the other layers) can be kept small for word-images, NN-training is feasible on the CPU (of course, a GPU would be better). This implementation is the bare minimum needed for HTR using TensorFlow (TF). We use a NN for our task.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.62)
- Information Technology > Artificial Intelligence > Vision > Handwriting Recognition (0.61)
Machine Learning with Python: NLP and Text Recognition
Student and freelance AI / Big Data Developer with a passion for full stack. In this article, I apply a series of natural language processing techniques to a dataset containing reviews about businesses. After that, I train a model using Logistic Regression to predict whether a review is "positive" or "negative". The natural language processing field contains a series of tools that are very useful for extracting, labeling, and predicting information starting from raw text data. This collection of techniques is mainly used in the fields of emotion recognition, text tagging (for example, to automate the process of sorting complaints from a client), chatbots, and voice assistants.
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.40)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)
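The review-classification step described above can be illustrated with a minimal sketch. It is not the article's code: the toy reviews, the whitespace tokenizer, and the function names `tokenize`, `train`, and `predict` are all assumptions made for illustration, and the logistic regression is written out by hand with bag-of-words counts instead of using a library.

```python
import math

# Hypothetical toy data, not the article's dataset: 1 = positive, 0 = negative.
REVIEWS = [
    "great food and friendly staff",
    "loved the service",
    "great experience",
    "terrible food and rude staff",
    "hated the service",
    "terrible experience",
]
LABELS = [1, 1, 1, 0, 0, 0]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(reviews, labels, epochs=200, lr=0.5):
    """Fit bag-of-words logistic regression with batch gradient descent."""
    weights = {}
    bias = 0.0
    n = len(reviews)
    for _ in range(epochs):
        grad_w = {}
        grad_b = 0.0
        for text, y in zip(reviews, labels):
            tokens = tokenize(text)
            z = bias + sum(weights.get(t, 0.0) for t in tokens)
            p = 1.0 / (1.0 + math.exp(-z))  # predicted P(positive)
            err = p - y                     # gradient of the log loss w.r.t. z
            for t in tokens:
                grad_w[t] = grad_w.get(t, 0.0) + err
            grad_b += err
        for t, g in grad_w.items():
            weights[t] = weights.get(t, 0.0) - lr * g / n
        bias -= lr * grad_b / n
    return weights, bias


def predict(text, weights, bias):
    """Label a review; words never seen in training contribute nothing."""
    z = bias + sum(weights.get(t, 0.0) for t in tokenize(text))
    return "positive" if z >= 0 else "negative"


weights, bias = train(REVIEWS, LABELS)
print(predict("great staff", weights, bias))
print(predict("terrible food", weights, bias))
```

On a real review dataset one would normally add text cleaning, a held-out test split, and a library implementation of the classifier; the sketch only shows the mechanics of turning word counts into a positive or negative label.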
ML Kit Android: Implementing Text Recognition -- Firebase
With Firebase set up, we can start building our Text Recognition app. We need the Firebase ML Vision dependency, which we add to our app-level build.gradle. After capturing the image from the camera, we set the image into the ImageView. Our app is then ready to use. Run the app and click on the camera icon to launch the camera on your Android device. Take a picture of some text, then click on the tick icon and watch Firebase do the magic for you.
- Information Technology > Communications > Mobile (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.66)
Dropbox text recognition makes it easier to find images and PDFs
There's nothing worse than having to pore over a pile of PDFs containing documents scanned as images when you quickly have to find a specific file. Dropbox is making it easier to do that by introducing automatic image recognition, which extracts text from photos and PDFs and makes it searchable. According to the cloud storage provider, there are 20 billion image and PDF files stored on Dropbox. Around 10 to 20 percent of those are photos of documents, so the new feature can be very useful. To look for a specific photo or PDF, you simply have to type in a keyword or phrase like you would on a search engine.
- Information Technology > Communications > Collaboration (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.44)
Facebook is making AI that can identify offensive memes
Facebook's moderators can't possibly look through every single image that gets posted on the enormous platform, so Facebook is building AI to help them out. In a blog post today, Facebook describes a system it's built called Rosetta that uses machine learning to identify text in images and videos and then transcribe it into something that's machine readable. In particular, Facebook is finding this tool helpful for transcribing the text on memes. Text transcription tools are nothing new, but Facebook faces different challenges because of the size of its platform and the variety of the images it sees. Rosetta is said to be live now, extracting text from 1 billion images and video frames per day across both Facebook and Instagram.
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.40)
Double Supervised Network with Attention Mechanism for Scene Text Recognition
Gao, Yuting, Huang, Zheng, Dai, Yuchen
In this paper, we propose Double Supervised Network with Attention Mechanism (DSAN), a novel end-to-end trainable framework for scene text recognition. It incorporates a text attention module during feature extraction, which forces the model to focus on text regions, and the whole framework is supervised by two branches. One supervision branch comes from context-level modelling, and the other comes from an extra supervision enhancement branch that aims at tackling inexplicit semantic information at the character level. These two supervisions can benefit each other and yield better performance. The proposed approach can recognize text of arbitrary length and does not need any predefined lexicon. Our method outperforms the current state-of-the-art methods on three text recognition benchmarks, IIIT5K, ICDAR2013 and SVT, reaching accuracies of 88.6%, 92.3% and 84.1% respectively, which suggests the effectiveness of the proposed method.
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.87)
STN-OCR: A single Neural Network for Text Detection and Text Recognition
STN-OCR, a single semi-supervised Deep Neural Network (DNN), consists of a spatial transformer network, which is used to detect text regions in images, and a text recognition network, which recognizes the textual content of the identified text regions. STN-OCR is an end-to-end scene text recognition system, but it is not easy to train. The model is mostly able to detect text in differently arranged lines of text in images, while also recognizing the content of these words. The overview of the system is shown in Figure 1. Compared with most current text recognition systems, which extract all the information from the image at once, STN-OCR behaves more like a human.
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
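As a rough illustration of the spatial transformer idea mentioned above, the following sketch builds an affine sampling grid and resamples an image with bilinear interpolation, the generic building block of spatial transformer networks. It is not taken from STN-OCR: the function names `affine_grid` and `bilinear_sample`, the nested-list image representation, and the choice to treat out-of-bounds pixels as zero are assumptions. In a real network the six affine parameters would be predicted by a localization network rather than written by hand.

```python
import math


def affine_grid(theta, height, width):
    """For each output pixel, compute the source location in normalized
    [-1, 1] coordinates using the 2x3 affine matrix `theta`."""
    (a, b, tx), (c, d, ty) = theta
    grid = []
    for i in range(height):
        yn = -1.0 + 2.0 * i / max(height - 1, 1)
        row = []
        for j in range(width):
            xn = -1.0 + 2.0 * j / max(width - 1, 1)
            row.append((a * xn + b * yn + tx, c * xn + d * yn + ty))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Sample `image` (a list of rows) at the grid locations.
    Neighbors that fall outside the image contribute zero."""
    h, w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for xs, ys in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            value = 0.0
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < h and 0 <= xx < w:
                        value += wy * wx * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]

identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
flip_x = ((-1.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # mirror left-to-right

same = bilinear_sample(image, affine_grid(identity, 3, 3))
mirrored = bilinear_sample(image, affine_grid(flip_x, 3, 3))
print(same)
print(mirrored)
```

Because both steps are differentiable with respect to the affine parameters, a network can learn where to look by gradient descent, which is what lets a detection stage and a recognition stage be trained together.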
Text Recognition for Video in Microsoft Video Indexer
In Video Indexer, we have the capability to recognize display text in videos. This blog explains some of the techniques we used to extract the best quality data. To start, take a look at the sequence of frames below. Did you manage to recognize the text in the images? It is highly likely that you did, without even noticing.
SqueezedText: A Real-Time Scene Text Recognition by Binary Convolutional Encoder-Decoder Network
Liu, Zichuan (Nanyang Technological University) | Li, Yixing (Arizona State University) | Ren, Fengbo (Arizona State University) | Goh, Wang Ling (Nanyang Technological University) | Yu, Hao (Southern University of Science and Technology)
A new approach for real-time scene text recognition is proposed in this paper: a novel binary convolutional encoder-decoder network (B-CEDNet) together with a bidirectional recurrent neural network (Bi-RNN). The B-CEDNet is engaged as a visual front-end to provide elaborated character detection, and a back-end Bi-RNN performs character-level sequential correction and classification based on learned contextual knowledge. The front-end B-CEDNet can process multiple regions containing characters using a one-off forward operation, and is trained under binary constraints with significant compression. Hence it leads to both a remarkable inference run-time speedup and a reduction in memory usage. With the elaborated character detection, the back-end Bi-RNN merely processes a low-dimensional feature sequence with category and spatial information of extracted characters for sequence correction and classification. By training with over 1,000,000 synthetic scene text images, the B-CEDNet achieves a recall rate of 0.86, precision of 0.88 and F-score of 0.87 on ICDAR-03 and ICDAR-13. With the correction and classification by Bi-RNN, the proposed real-time scene text recognition achieves state-of-the-art accuracy while consuming less than 1 ms of inference run-time. The processing flow is realized on GPU with a small network size of 1.01 MB for B-CEDNet and 3.23 MB for Bi-RNN, which is much faster and smaller than the existing solutions.
- North America > Mexico > Gulf of Mexico (0.05)
- North America > United States > Arizona (0.04)
- Asia > Singapore (0.04)
- Asia > China (0.04)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.84)
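The detection figures quoted in the SqueezedText abstract can be cross-checked with a short calculation. The sketch below assumes the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` and its `beta` parameter are illustrative.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
reported = f_score(precision=0.88, recall=0.86)
print(round(reported, 2))  # 0.87
```

Under that assumption, the reported precision of 0.88 and recall of 0.86 are consistent with the stated F-score of 0.87.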