 Optical Character Recognition


The AI Pinball Player That Could Beat Humans Within 4 Days

#artificialintelligence

Developers have taught an artificial intelligence to play an arcade pinball machine, and it learned so quickly that it could beat human players within four days. Speaking at Microsoft's developer conference, Build, which is being held virtually this week, Jack Skinner described how he and a team of developers in Sydney used artificial intelligence to control an actual pinball machine. The team took a regular arcade machine and adapted it, using a Windows computer to run the AI software and a Raspberry Pi to control the flipper mechanism within the pinball machine. Two webcams were mounted on the pinball machine - one pointed at the scoreboard and one pointed down at the table - so that the AI could "see" the table like a human player would. Optical character recognition (OCR) software allowed the computer to read the current score from the pinball machine's electronic display.


AI-Powered Biotech Can Help Deploy a Vaccine In Record Time

WIRED

The magnitude of the Covid-19 pandemic will largely depend on how quickly safe and effective vaccines and treatments can be developed and tested. Many assume a widely available vaccine is years away, if ever. Others believe that a 12- to 18-month development cycle is a given. Our best bet to reduce even that record-breaking timeline is by using artificial intelligence. The problem is twofold: discovering the right set of molecules among billions of possibilities, and then waiting for clinical trials. These processes ordinarily take several years, but AI holds the key to radically shortening both.


From Videos to URLs: A Multi-Browser Guide To Extract User's Behavior with Optical Character Recognition

arXiv.org Artificial Intelligence

Tracking users' activities on the World Wide Web (WWW) allows researchers to analyze each user's internet behavior over time, including the amount of time spent on a particular domain. This analysis can be used in research design, as researchers may access their participants' behaviors while browsing the web. Web search behavior has been a subject of interest because of its real-world applications in marketing, digital advertising, and identifying potential threats online. In this paper, we present an image-processing-based method to extract the domains visited by a participant across multiple browsers during a lab session. This method could provide another way to collect users' activities during an online session, given that the session recorder collected the data.
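As a rough illustration of the last step of such a pipeline, the sketch below takes strings that an OCR engine might have read from the address bar of each video frame and reduces them to visited domains. It is not the paper's method, which the abstract does not detail; the function names `domain_from_ocr` and `frames_per_domain` and the sample strings are hypothetical, and the OCR step itself is assumed to have already happened.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_from_ocr(text):
    """Return the lowercase host from one OCR'd address-bar string, or None."""
    cleaned = text.strip().replace(" ", "")  # assumption: OCR may insert stray spaces
    if not cleaned:
        return None
    if "://" not in cleaned:
        cleaned = "//" + cleaned  # makes urlsplit treat the start as the host
    try:
        host = urlsplit(cleaned).hostname
    except ValueError:
        return None
    if not host or "." not in host:
        return None  # not something that looks like a domain
    return host[4:] if host.startswith("www.") else host


def frames_per_domain(frame_texts):
    """Count how many frames show each domain."""
    counts = Counter()
    for text in frame_texts:
        domain = domain_from_ocr(text)
        if domain:
            counts[domain] += 1
    return counts


# Hypothetical OCR output, one string per sampled video frame.
frames = [
    "https://www.example.com/news/today",
    "example.com/sports",
    "https://arxiv.org/abs/0000.00000",
    "not a url",
]
visits = frames_per_domain(frames)
```

If frames are sampled at a fixed interval, multiplying each count by that interval gives an estimate of the time spent on each domain.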


A Gaussian Process Upsampling Model for Improvements in Optical Character Recognition

arXiv.org Machine Learning

Optical Character Recognition and extraction is a key tool in the automatic evaluation of documents in a financial context. However, the image data provided to automated systems can have unreliable quality, and can be inherently low-resolution or downsampled and compressed by a transmitting program. In this paper, we illustrate the efficacy of a Gaussian Process upsampling model for the purposes of improving OCR and extraction through upsampling low resolution documents.


Newspaper Navigator

University of Washington Computer Science

Welcome to the Newspaper Navigator dataset! This dataset consists of extracted visual content for 16,358,041 historic newspaper pages in Chronicling America. The visual content was identified using an object detection model trained on annotations of World War 1-era Chronicling America pages, including annotations made by volunteers as part of the Beyond Words crowdsourcing project. The dataset also includes text corresponding to the visual content, identified by extracting the Optical Character Recognition, or OCR, within each predicted bounding box. For example, if the visual content recognition model predicted a bounding box around a headline, the corresponding textual content provides a machine-readable version of the headline; likewise, for a photograph, illustration, or map, this textual representation often contains the title and caption.
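The idea of pairing each predicted bounding box with the OCR text inside it can be sketched as follows. This is a minimal illustration, not the dataset's actual extraction code: the function `text_in_box`, the `(x_min, y_min, x_max, y_max)` box layout, the centre-point rule, and the sample words are all assumptions made for the example.

```python
def text_in_box(ocr_words, box):
    """Join OCR words whose centre point lies inside the given bounding box.

    ocr_words: list of (word, (x0, y0, x1, y1)) tuples, assumed to be in reading order.
    box: (x_min, y_min, x_max, y_max) of a predicted region such as a headline.
    """
    x_min, y_min, x_max, y_max = box
    picked = []
    for word, (wx0, wy0, wx1, wy1) in ocr_words:
        cx = (wx0 + wx1) / 2  # centre of the word's own box
        cy = (wy0 + wy1) / 2
        if x_min <= cx <= x_max and y_min <= cy <= y_max:
            picked.append(word)
    return " ".join(picked)


# Hypothetical OCR words with pixel coordinates on one page.
page_words = [
    ("LOCAL", (10, 10, 60, 30)),
    ("NEWS", (65, 10, 150, 30)),
    ("Weather", (10, 200, 70, 215)),
]
headline_box = (0, 0, 200, 50)  # hypothetical predicted headline region
headline_text = text_in_box(page_words, headline_box)
```

Testing the word's centre rather than its full extent keeps words that slightly overhang the predicted box, which is one plausible choice when the box is only approximate.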


Judge Dismisses Lawsuit Over Mail Delivery

U.S. News

The apartment complexes near Western Kentucky University sued the United States Postal Service and a postmaster in January after the agency began delivering mail in bulk to property management offices instead of tenants' mailboxes. The change came after the Postal Service reclassified the residences as dormitories, according to the lawsuit.


Shape Context descriptor and fast characters recognition

#artificialintelligence

Matching shapes can be a much more difficult task than matching images, for example when recognizing handwritten text or fingerprints, because most of the shapes we try to match are heavily distorted. I would bet that you will never write two identical letters in your whole life. Now consider this from the point of view of an algorithm that identifies people by matching their handwriting: it would be hell. Of course, in the age of neural networks and RNNs this can also be solved by means other than straight mathematics, but you cannot always use heavy, memory-hungry tools like NNs.


How to Use Optical Character Recognition for Security System Development

#artificialintelligence

Applying machine learning techniques to security solutions is one of the current AI trends. This article will cover the approach to developing OCR-based software using deep learning algorithms. This software can be used to analyze and process identification documents, such as a US driver's license, as part of a security system for verifying identity. OCR (Optical Character Recognition) technology is already used by machine learning companies for business process automation and optimization. Use cases range from Dropbox parsing pictures and Google Street View identifying street signs to searching through text messages and translating text in real time. In this particular case, OCR can be used as part of an automated biometric verification system.


FastSpeech: Fast, Robust and Controllable Text to Speech

Neural Information Processing Systems

Neural network based end-to-end text to speech (TTS) has significantly improved the quality of synthesized speech. Prominent methods (e.g., Tacotron 2) usually first generate a mel-spectrogram from text, and then synthesize speech from the mel-spectrogram using a vocoder such as WaveNet. Compared with traditional concatenative and statistical parametric approaches, neural network based end-to-end models suffer from slow inference speed, and the synthesized speech is usually not robust (i.e., some words are skipped or repeated) and lacks controllability (voice speed or prosody control). In this work, we propose a novel feed-forward network based on Transformer to generate mel-spectrograms in parallel for TTS. Specifically, we extract attention alignments from an encoder-decoder based teacher model for phoneme duration prediction, which is used by a length regulator to expand the source phoneme sequence to match the length of the target mel-spectrogram sequence for parallel mel-spectrogram generation. Experiments on the LJSpeech dataset show that our parallel model matches autoregressive models in terms of speech quality, nearly eliminates the problem of word skipping and repeating in particularly hard cases, and can adjust voice speed smoothly.
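The length regulator described above can be illustrated with a small sketch. It repeats each phoneme according to its predicted duration, measured in mel-spectrogram frames, so the expanded sequence matches the target length. The real model expands hidden states rather than symbols; here plain strings stand in for them. The function name, the scaling parameter `alpha`, and the sample phonemes and durations are illustrative assumptions, not the authors' code.

```python
def length_regulator(phonemes, durations, alpha=1.0):
    """Repeat each phoneme by its duration in frames, scaled by alpha.

    alpha > 1 lengthens every phoneme (slower speech);
    alpha < 1 shortens them (faster speech).
    """
    if len(phonemes) != len(durations):
        raise ValueError("need exactly one duration per phoneme")
    expanded = []
    for phoneme, duration in zip(phonemes, durations):
        repeats = max(0, round(duration * alpha))  # whole number of frames
        expanded.extend([phoneme] * repeats)
    return expanded


# Hypothetical phoneme sequence with predicted durations in frames.
phonemes = ["HH", "AH", "L", "OW"]
durations = [2, 3, 1, 4]

normal = length_regulator(phonemes, durations)             # 10 frames
slow = length_regulator(phonemes, durations, alpha=2.0)    # 20 frames
```

Because every position of the expanded sequence is known up front, the decoder can produce all mel-spectrogram frames in parallel, and changing `alpha` is one simple way to adjust speaking rate.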


Utopia Global Releases Cloud-Based Intelligent Data Capture and Control Software: Platform Delivers High Quality Enriched Asset Master Data Leveraging Machine Learning

#artificialintelligence

IDCC uniquely leverages optical character recognition, Utopia's advanced machine learning code, intelligent online web search, and document search. Beginning simply with only a photo of a manufacturer's nameplate, IDCC can produce complete and accurate material and asset information. Manufacturer and model data is organized in ISO-14224 standards and can be delivered via a variety of easy-to-integrate methods, including SAP Asset Intelligence Network. The cloud-based nature of IDCC enables cost-effective, rapid deployments by large and small organizations alike. IDCC can be deployed in pure cloud environments, such as SAP Intelligent Asset Management, or hybrid deployments using SAP Master Data Governance, enterprise asset management extension by Utopia.