AITopics | Optical Character Recognition


Optical Character Recognition

Our second example deals with a more challenging problem: the recognition of hand-printed letters of the alphabet. The characters that people print in the ordinary course of filling out forms and questionnaires are surprisingly varied. Gaps abound where continuous lines might be expected; curves and sharp angles appear interchangeably; there is almost every imaginable distortion of slant, shape and size. Even human readers cannot always identify such characters; their error rate is about 3 per cent on randomly selected letters and numbers, seen out of context.
– from Oliver G. Selfridge & Ulric Neisser. PATTERN RECOGNITION BY MACHINE. In Computers & thought, Edward A. Feigenbaum and Julian Feldman (Eds.). MIT Press, Cambridge, MA, USA, 1963. pp. 8-30.


6 cognitive automation use cases in the enterprise

#artificialintelligence | Jul-15-2020, 20:29:54 GMT

Cognitive automation is an extension of existing robotic process automation (RPA) technology. Machine learning enables bots to remember the best ways of completing tasks, while technology like optical character recognition increases the data formats with which bots can interact. Cognitive automation adds a layer of AI to RPA software to enhance the ability of RPA bots to complete tasks that require more knowledge and reasoning. These tasks can range from answering complex customer queries to extracting pertinent information from document scans. Some examples of mature cognitive automation use cases include intelligent document processing and intelligent virtual agents. In contrast, Modi sees intelligent automation as the automation of more rote tasks and processes by combining RPA and AI.

artificial intelligence, cognitive automation use case, machine learning, (3 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.64)
Information Technology > Artificial Intelligence > Robots (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.64)
Information Technology > Artificial Intelligence > Machine Learning (0.64)


Object Detection on Newspaper images using YoloV3

#artificialintelligence | Jul-14-2020, 16:38:32 GMT

I was trying my hand at Optical Character Recognition on newspaper images when I realised that most documents have sections and that text does not necessarily span the entire horizontal space of the page. Even though Tesseract was able to recognise the text, it was jumbled up. To fix this, the model should be able to identify the sections of the document, draw a bounding box around each, and then perform OCR. It was at this moment that applying YOLO object detection to such images came to mind. YOLOv3 is extremely fast and accurate.
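The section-then-OCR idea described above depends on feeding the detected boxes to the OCR engine in a sensible reading order. The following is a minimal sketch of that ordering step only, not the article's own code: the `Box` type and the `reading_order` function are hypothetical names, the boxes are assumed to come from some detector such as YOLOv3, and the detection and OCR calls themselves are deliberately left out.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def reading_order(boxes: List[Box]) -> List[Box]:
    """Group boxes into columns by horizontal overlap, then read each
    column top to bottom, with columns taken left to right."""
    columns: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x):
        # A box joins the current column if it starts to the left of that
        # column's right edge; otherwise it opens a new column.
        if columns and box.x < max(b.x + b.w for b in columns[-1]):
            columns[-1].append(box)
        else:
            columns.append([box])

    ordered: List[Box] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b.y))
    return ordered


if __name__ == "__main__":
    # Two columns with two sections each, supplied in arbitrary order.
    detected = [
        Box(120, 110, 100, 100),
        Box(0, 60, 100, 200),
        Box(120, 0, 100, 100),
        Box(0, 0, 100, 50),
    ]
    for box in reading_order(detected):
        # Each box would be cropped from the page and passed to OCR here.
        print(box)
```

One known limitation of this simple grouping: a full-width element such as a banner headline overlaps every column and would merge them into one, so a real pipeline would likely need to treat such boxes separately.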

artificial intelligence, object detection, optical character recognition, (8 more...)

#artificialintelligence

Industry: Media > News (0.63)

Technology: Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.57)


ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING FOR THE INDIAN NAVY - National Maritime Foundation

#artificialintelligence | Jul-14-2020, 16:35:30 GMT

Artificial Intelligence (AI) -- and its attendant term, 'Machine Learning' (ML) -- is described as the capability of a computer system to perform tasks that normally require human intelligence, such as visual perception, speech recognition and decision-making. Almost all AI/ML examples in commercial as well as military use today rely on data stores that drive deep learning and natural language processing.[1] The defining feature of an AI/ML system is its ability to learn and solve problems. There has been a gradual change in our understanding of what exactly constitutes AI. While advancements in computer hardware and more efficient software have led to the development of AI systems, hitherto computer-resource-intensive tasks, such as optical character recognition (OCR), are now considered a routine technology and, hence, are no longer included in any contemporary discussion of AI/ML.

artificial intelligence, machine learning, optical character recognition, (11 more...)

#artificialintelligence

Country:

Asia > India (0.45)
North America > United States (0.29)
Europe > Russia (0.04)
(3 more...)

Industry:

Government > Military > Navy (0.87)
Government > Regional Government > Asia Government > India Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.54)


Text to Speech Technology: How Voice Computing is Building a More Accessible World

#artificialintelligence | Jun-29-2020, 22:16:22 GMT

In a world where new technology emerges at exponential rates, and our daily lives are increasingly mediated by speakers and sound waves, text to speech technology is the latest force evolving the way we communicate. Text to speech technology refers to a field of computer science that enables the conversion of language text into audible speech. Also known as voice computing, text to speech (TTS) often involves building a database of recorded human speech to train a computer to produce sound waves that resemble the natural sound of a human speaking. This process is called speech synthesis. The technology is trailblazing and major breakthroughs in the field occur regularly.

artificial intelligence, optical character recognition, speech technology, (10 more...)

#artificialintelligence

Country: North America > United States (0.05)

Industry:

Health & Medicine (0.49)
Leisure & Entertainment (0.48)
Media (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)


r/MachineLearning - [2006.04558] FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech

#artificialintelligence | Jun-9-2020, 08:09:56 GMT

Abstract: Advanced text-to-speech (TTS) models such as FastSpeech can synthesize speech significantly faster than previous autoregressive models with comparable quality. The training of FastSpeech model relies on an autoregressive teacher model for duration prediction (to provide more information as input) and knowledge distillation (to simplify the data distribution in output), which can ease the one-to-many mapping problem (i.e., multiple speech variations correspond to the same text) in TTS. However, FastSpeech has several disadvantages: 1) the teacher-student distillation pipeline is complicated, 2) the duration extracted from the teacher model is not accurate enough, and the target mel-spectrograms distilled from teacher model suffer from information loss due to data simplification, both of which limit the voice quality. In this paper, we propose FastSpeech 2, which addresses the issues in FastSpeech and better solves the one-to-many mapping problem in TTS by 1) directly training the model with ground-truth target instead of the simplified output from teacher, and 2) introducing more variation information of speech (e.g., pitch, energy and more accurate duration) as conditional inputs. Specifically, we extract duration, pitch and energy from speech waveform and directly take them as conditional inputs during training and use predicted values during inference. We further design FastSpeech 2s, which is the first attempt to directly generate speech waveform from text in parallel, enjoying the benefit of full end-to-end training and even faster inference than FastSpeech.
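To make the idea of extracting a per-frame feature from a waveform concrete, here is a rough sketch of a frame-level energy computation. It is an illustration under stated assumptions, not the procedure used in the paper, which the abstract does not describe: energy is taken here as the L2 norm of the raw samples in each fixed-size frame, and `frame_energy`, `frame_size` and `hop` are hypothetical names.

```python
import math
from typing import List, Sequence


def frame_energy(samples: Sequence[float], frame_size: int, hop: int) -> List[float]:
    """Return one energy value per frame: the L2 norm of the frame's samples.

    Assumption: frames are taken from the raw waveform with a fixed hop,
    and a trailing partial frame is dropped.
    """
    if frame_size <= 0 or hop <= 0:
        raise ValueError("frame_size and hop must be positive")
    if len(samples) < frame_size:
        return []

    energies: List[float] = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame)))
    return energies


if __name__ == "__main__":
    # A toy "waveform": a loud frame followed by a silent one.
    waveform = [3.0, 4.0, 0.0, 0.0]
    print(frame_energy(waveform, frame_size=2, hop=2))
```

A sequence like this, aligned with the text through the extracted durations, is the kind of signal that could serve as a conditional input during training and be replaced by predicted values at inference time.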

artificial intelligence, optical character recognition, social media, (13 more...)

#artificialintelligence

Industry: Media > News (0.40)

Technology:

Information Technology > Communications > Social Media (0.76)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.64)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.64)


The AI Pinball Player That Could Beat Humans Within 4 Days

#artificialintelligence | May-21-2020, 02:20:37 GMT

Developers have taught an artificial intelligence to play an arcade pinball machine, and it learned so quickly that it could beat human players within four days. Speaking at Microsoft's developer conference, Build, which is being held virtually this week, Jack Skinner described how he and a team of developers in Sydney used artificial intelligence to control an actual pinball machine. The team took a regular arcade machine and adapted it, using a Windows computer to control the AI software, and a Raspberry Pi to control the flipper mechanism within the pinball machine. Two webcams were mounted on the pinball machine - one pointed at the scoreboard and one pointed down at the table - so that the AI could "see" the table like a human player would. Optical character recognition (OCR) software allowed the computer to read the current score from the pinball machine's electronic display.

artificial intelligence, machine learning, pinball machine, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.93)


AI-Powered Biotech Can Help Deploy a Vaccine In Record Time

WIRED | May-19-2020, 13:52:58 GMT

The magnitude of the Covid-19 pandemic will largely depend on how quickly safe and effective vaccines and treatments can be developed and tested. Many assume a widely available vaccine is years away, if ever. Others believe that a 12- to 18-month development cycle is a given. Our best bet to reduce even that record-breaking timeline is by using artificial intelligence. The problem is twofold: discovering the right set of molecules among billions of possibilities, and then waiting for clinical trials. These processes ordinarily take several years, but AI holds the key to radically shortening both.

artificial intelligence, machine learning, optical character recognition, (12 more...)

WIRED

AI-Alerts: 2020 > 2020-06 > AAAI AI-Alert Ethics for Jun 16, 2020 (1.00)

Country: Oceania > Australia > South Australia > Adelaide (0.05)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.50)
Information Technology > Artificial Intelligence > Machine Learning (0.50)
Information Technology > Artificial Intelligence > Applied AI (0.36)
Information Technology > Artificial Intelligence > Speech (0.31)


From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition

Heidarysafa, Mojtaba, Reed, James, Kowsari, Kamran, Leviton, April Celeste R., Warren, Janet I., Brown, Donald E.

arXiv.org Artificial Intelligence | May-19-2020

Tracking users' activities on the World Wide Web (WWW) allows researchers to analyze each user's internet behavior over time, including the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access their participants' behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertisement, and identifying potential threats online. In this paper, we present an image-processing based method to extract domains which are visited by a participant over multiple browsers during a lab session. This method could provide another way to collect users' activities during an online session given that the session recorder collected the data.
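As a rough illustration of the last step of such a pipeline, the sketch below turns address-bar strings that have already been read by OCR into domains and accumulates the time spent on each. It is not the authors' method: the function names, the one-string-per-sampled-frame input and the fixed `seconds_per_frame` are all assumptions made for the example, and the video cropping and OCR stages are omitted.

```python
import re
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit

# A host name is accepted only if it looks like "label.label.tld".
DOMAIN_RE = re.compile(r"(?:[a-z0-9-]+\.)+[a-z]{2,}")


def domain_from_text(text: str) -> Optional[str]:
    """Return the lowercase domain found in one OCR'd address-bar string,
    or None if no plausible domain can be parsed."""
    candidate = text.strip().replace(" ", "")  # OCR may insert stray spaces
    if not candidate:
        return None
    if "://" not in candidate:
        candidate = "http://" + candidate  # urlsplit needs a scheme to find the host
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        return None
    if host is None or not DOMAIN_RE.fullmatch(host):
        return None
    return host[4:] if host.startswith("www.") else host


def time_per_domain(frame_texts: Iterable[str], seconds_per_frame: float) -> Counter:
    """Sum the time attributed to each domain, one sampled frame at a time."""
    totals: Counter = Counter()
    for text in frame_texts:
        domain = domain_from_text(text)
        if domain is not None:
            totals[domain] += seconds_per_frame
    return totals


if __name__ == "__main__":
    frames = [
        "https://www.example.com/a",
        "www.example.com /b",
        "news.example.org",
        "???",  # unreadable frame, ignored
    ]
    print(time_per_domain(frames, seconds_per_frame=1.0))
```

Frames whose text cannot be parsed are skipped rather than guessed at, which keeps OCR noise from inflating the totals for any one domain.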

artificial intelligence, machine learning, optical character recognition, (11 more...)

arXiv.org Artificial Intelligence

1811.06193

Country:

North America > United States > California > Riverside County > Riverside (0.14)
North America > United States > Virginia (0.05)
North America > United States > Michigan (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


A Gaussian Process Upsampling Model for Improvements in Optical Character Recognition

Reeves, Steven I, Lee, Dongwook, Singh, Anurag, Verma, Kunal

arXiv.org Machine Learning | May-7-2020

Optical Character Recognition and extraction is a key tool in the automatic evaluation of documents in a financial context. However, the image data provided to automated systems can have unreliable quality, and can be inherently low-resolution or downsampled and compressed by a transmitting program. In this paper, we illustrate the efficacy of a Gaussian Process upsampling model for the purposes of improving OCR and extraction through upsampling low resolution documents.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2005.0378

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Newspaper Navigator

University of Washington Computer Science | May-6-2020, 21:24:28 GMT

Welcome to the Newspaper Navigator dataset! This dataset consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. The dataset also includes text corresponding to the visual content, identified by extracting the Optical Character Recognition, or OCR, within each predicted bounding box. For example, if the visual content recognition model predicted a bounding box around a headline, the corresponding textual content provides a machine-readable version of the headline; likewise, for a photograph, illustration, or map, this textual representation often contains the title and caption.
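The step of collecting the OCR text that falls within a predicted bounding box can be sketched as follows. This is an illustrative approximation, not the dataset's actual pipeline: it assumes the page OCR is available as a list of words with pixel coordinates, treats a word as inside the box when its center is, and uses the hypothetical names `Word` and `text_in_box`.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """One OCR'd word with its position on the page (hypothetical type)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def text_in_box(words: List[Word], box: Tuple[float, float, float, float]) -> str:
    """Join the words whose centers fall inside box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    inside = []
    for word in words:
        cx = word.x + word.w / 2
        cy = word.y + word.h / 2
        if left <= cx <= right and top <= cy <= bottom:
            inside.append(word)
    # Simplification: order by top edge, then left edge. Words on the same
    # line are assumed to share the same y coordinate.
    inside.sort(key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in inside)


if __name__ == "__main__":
    page_words = [
        Word("NEWS", 50, 10, 40, 10),
        Word("caption", 10, 200, 60, 10),
        Word("BIG", 10, 10, 30, 10),
    ]
    headline_box = (0, 0, 100, 30)  # a predicted box around a headline
    print(text_in_box(page_words, headline_box))
```

Using the word center rather than requiring full containment tolerates boxes that clip the edge of a word, at the cost of occasionally pulling in text from a neighboring region.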

artificial intelligence, dataset, optical character recognition, (14 more...)

University of Washington Computer Science

Industry: Media > News (0.91)

Technology: Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.55)
