AITopics

2111.11011

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.64)

arXiv.org Artificial Intelligence, Apr-25-2021

Parallel Scale-wise Attention Network for Effective Scene Text Recognition

Sajid, Usman, Chow, Michael, Zhang, Jin, Kim, Taejoon, Wang, Guanghui

The paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ the attention mechanism in either the text encoder or the decoder for text alignment. Although encoder-based attention yields promising results, these schemes have noticeable limitations. They perform feature extraction (FE) and visual attention (VA) sequentially, which forces the attention mechanism to rely only on the final single-scale output of FE. Moreover, attention is applied directly only to single-scale feature maps, which limits its usefulness. To address these issues, we propose a new multi-scale, encoder-based attention network for text recognition that performs multi-scale FE and VA in parallel. The multi-scale channels are also regularly fused with each other so that they develop coordinated knowledge. Quantitative evaluation and robustness analysis on the standard benchmarks demonstrate that the proposed network outperforms the state of the art in most cases.

deep learning, neural network, text recognition, (18 more...)

2104.12076

Country: North America > United States > Kansas > Douglas County > Lawrence (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.84)

arXiv.org Artificial Intelligence, Sep-22-2020

Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition

Li, Bingcong, Tang, Xin, Qi, Xianbiao, Chen, Yihao, Xiao, Rong

Recently, inspired by the Transformer, self-attention-based scene text recognition approaches have achieved outstanding performance. However, we find that model size grows rapidly as the lexicon grows. Specifically, the number of parameters in the softmax classification layer and the output embedding layer is proportional to the vocabulary size. This hinders the development of lightweight text recognition models, especially for Chinese and multilingual applications. Thus, we propose a lightweight scene text recognition model named Hamming OCR. In this model, a novel Hamming classifier, which adopts a locality sensitive hashing (LSH) algorithm to encode each character, is proposed to replace the softmax regression, and the generated LSH code is directly employed to replace the output embedding. We also present a simplified transformer decoder that reduces the number of parameters by removing the feed-forward network and using a cross-layer parameter sharing technique. Compared with traditional methods, the number of parameters in both the classification and embedding layers is independent of the vocabulary size, which significantly reduces the storage requirement without loss of accuracy. Experimental results on several datasets, including four public benchmarks and a Chinese text dataset synthesized by SynthText with more than 20,000 characters, show that Hamming OCR achieves competitive results.
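The idea of replacing a softmax over characters with a fixed-length binary code can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes sign-random-projection LSH over made-up per-character feature vectors, a toy ten-character vocabulary, and hypothetical sizes (`feature_dim`, `num_bits`, `hidden_dim`). It decodes a predicted bit string by choosing the character whose code is nearest in Hamming distance, and compares the parameter count of a linear softmax layer with that of a linear layer that outputs code bits (bias terms ignored).

```python
import random


def random_hyperplanes(num_bits, dim, rng):
    """Draw one random Gaussian hyperplane normal per output bit."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """Sign-random-projection LSH: bit i is 1 if the vector lies on the
    non-negative side of hyperplane i, else 0."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(predicted_bits, codebook):
    """Return the character whose code is closest in Hamming distance."""
    return min(codebook, key=lambda ch: hamming_distance(predicted_bits, codebook[ch]))


rng = random.Random(0)
vocab = list("abcdefghij")        # toy vocabulary (assumption)
feature_dim, num_bits = 16, 32    # hypothetical sizes

# Made-up character features; a real system would derive these differently.
char_features = {ch: [rng.gauss(0.0, 1.0) for _ in range(feature_dim)] for ch in vocab}
planes = random_hyperplanes(num_bits, feature_dim, rng)
codebook = {ch: lsh_code(vec, planes) for ch, vec in char_features.items()}

# Simulate an imperfect prediction: the code for "c" with two bits flipped.
noisy = list(codebook["c"])
noisy[0] ^= 1
noisy[5] ^= 1
noisy = tuple(noisy)
predicted_char = decode(noisy, codebook)

# Parameter counts of the output layer, assuming a plain linear map.
hidden_dim, vocab_size = 512, 20000        # hypothetical model and lexicon sizes
softmax_params = hidden_dim * vocab_size   # grows with the vocabulary
hamming_params = hidden_dim * num_bits     # fixed by the code length
print(predicted_char, softmax_params, hamming_params)
```

Under these assumptions the bit-output layer needs 16,384 weights regardless of vocabulary size, whereas the softmax layer needs 10,240,000 for a 20,000-character lexicon.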

artificial intelligence, hamming ocr, neural network, (15 more...)

2009.10874

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

#artificialintelligence, Sep-23-2019, 10:28:43 GMT

Build a Handwritten Text Recognition System using TensorFlow

Offline Handwritten Text Recognition (HTR) systems transcribe text contained in scanned images into digital text; an example is shown in Figure 1. We will build a Neural Network (NN) that is trained on word-images from the IAM dataset. Because the input layer (and therefore all the other layers) can be kept small for word-images, NN training is feasible on the CPU (though a GPU would be better). This implementation is the bare minimum needed for HTR using TensorFlow (TF).

artificial intelligence, neural network, sequence, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.62)
Information Technology > Artificial Intelligence > Vision > Handwriting Recognition (0.61)

#artificialintelligence, Feb-19-2019, 17:52:12 GMT

Machine Learning with Python: NLP and Text Recognition

Student and freelance AI / Big Data Developer with a passion for full stack. In this article, I apply a series of natural language processing techniques to a dataset of business reviews. After that, I train a Logistic Regression model to predict whether a review is "positive" or "negative". The natural language processing field offers a series of tools that are very useful for extracting, labeling, and predicting information from raw text data. This collection of techniques is mainly used for emotion recognition, text tagging (for example, to automate the sorting of client complaints), chatbots, and voice assistants.
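As a rough illustration of the pipeline described above, the sketch below builds bag-of-words features and trains a logistic regression classifier with plain batch gradient descent. It is not the article's code: the four reviews are invented toy data, the tokenizer is a simple lowercase whitespace split, and the learning rate and epoch count are arbitrary assumptions. A real project would more likely use a library implementation.

```python
import math

# Toy data invented for illustration (1 = positive, 0 = negative).
reviews = [
    "great food friendly staff",
    "terrible service cold food",
    "loved the great service",
    "awful terrible experience",
]
labels = [1, 0, 1, 0]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


vocab = sorted({word for review in reviews for word in tokenize(review)})
index = {word: i for i, word in enumerate(vocab)}


def featurize(text):
    """Bag-of-words count vector; words outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in index:
            vec[index[word]] += 1.0
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, targets, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, targets):
            # Prediction error for this example: p(positive) minus the label.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(text, weights, bias):
    """Return 1 for a predicted positive review, 0 for negative."""
    z = sum(w * xi for w, xi in zip(weights, featurize(text))) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


weights, bias = train([featurize(r) for r in reviews], labels)
train_predictions = [predict(r, weights, bias) for r in reviews]
print(train_predictions)
```

Calling `predict("great staff", weights, bias)` scores a new review with the same weights; with only four training examples the result says little about real-world accuracy.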

artificial intelligence, natural language, positive review, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)

#artificialintelligence, Jan-6-2019, 13:35:00 GMT

ML Kit Android: Implementing Text Recognition -- Firebase

Firebase is now set up, so we can start building our Text Recognition app. We need the Firebase ML Vision dependency, which we add in our app-level build.gradle. After capturing the image from the camera, we'll set the image into the ImageView. Our app is ready to use. Run the app and click on the camera icon to launch the camera on your Android device. Take a picture of some text, then click on the tick icon and watch Firebase do the magic for you.

artificial intelligence, firebase, machine learning, (4 more...)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.66)

Engadget, Oct-10-2018, 10:18:55 GMT

Dropbox text recognition makes it easier to find images and PDFs

There's nothing worse than having to pore over a pile of PDFs containing documents scanned as images when you need to find a specific file quickly. Dropbox is making that easier by introducing automatic image recognition, which extracts text from photos and PDFs and makes it searchable. According to the cloud storage provider, there are 20 billion image and PDF files stored on Dropbox. Around 10 to 20 percent of those are photos of documents, so the new feature can be very useful. To look for a specific photo or PDF, you simply type in a keyword or phrase as you would on a search engine.

artificial intelligence, dropbox, information technology services, (4 more...)

Technology:

Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.44)

#artificialintelligence, Sep-11-2018, 18:05:21 GMT

Facebook is making AI that can identify offensive memes

Facebook's moderators can't possibly look through every single image that gets posted on the enormous platform, so Facebook is building AI to help them out. In a blog post today, Facebook describes a system it's built called Rosetta that uses machine learning to identify text in images and videos and then transcribe it into something that's machine readable. In particular, Facebook is finding this tool helpful for transcribing the text on memes. Text transcription tools are nothing new, but Facebook faces different challenges because of the size of its platform and the variety of the images it sees. Rosetta is said to be live now, extracting text from 1 billion images and video frames per day across both Facebook and Instagram.

artificial intelligence, facebook, social media, (5 more...)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.40)

arXiv.org Artificial Intelligence, Aug-2-2018

Double Supervised Network with Attention Mechanism for Scene Text Recognition

Gao, Yuting, Huang, Zheng, Dai, Yuchen

In this paper, we propose the Double Supervised Network with Attention Mechanism (DSAN), a novel end-to-end trainable framework for scene text recognition. It incorporates a text attention module during feature extraction, which forces the model to focus on text regions, and the whole framework is supervised by two branches. One supervision branch comes from context-level modelling, and the other comes from an extra supervision enhancement branch that aims at tackling inexplicit semantic information at the character level. These two supervisions can benefit each other and yield better performance. The proposed approach can recognize text of arbitrary length and does not need any predefined lexicon. Our method outperforms the current state-of-the-art methods on three text recognition benchmarks, IIIT5K, ICDAR2013 and SVT, reaching accuracies of 88.6%, 92.3% and 84.1% respectively, which suggests the effectiveness of the proposed method.

deep learning, neural network, text recognition, (16 more...)

1808.00677

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (0.87)

#artificialintelligence, Mar-25-2018, 08:36:50 GMT

STN-OCR: A single Neural Network for Text Detection and Text Recognition

STN-OCR, a single semi-supervised Deep Neural Network (DNN), consists of a spatial transformer network -- which is used to detect text regions in images, and a text recognition network -- which recognizes the textual content of the identified text regions. STN-OCR is an end-to-end scene text recognition system, but it is not easy to train. The model is largely able to detect differently arranged lines of text in images, while also recognizing the content of these words. The overview of the system is shown in Figure 1. Compared with most current text recognition systems, which extract all the information from the image at once, STN-OCR behaves more like a human.

deep learning, detection stage, neural network, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition > Text Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)