 Optical Character Recognition


Face recognition and OCR processing of 300 million records from US yearbooks

#artificialintelligence

A yearbook is a type of book published annually to record, highlight, and commemorate the past year of a school. Our team at MyHeritage took on a complex project: extracting individual pictures, names, and ages from hundreds of thousands of yearbooks, structuring the data, and creating a searchable index that covers the majority of US schools between the years 1890–1979, more than 290 million individuals in total. In this article I'll describe the problems we encountered during this project and how we solved them. First of all, let me explain why we needed to tackle this challenge. MyHeritage is a genealogy platform that provides access to almost 10 billion historical records.


Artificial intelligence #01: what's it all about? Practice Business

#artificialintelligence

The Royal College of General Practitioners (RCGP) adopts the AI definition found in the government's industrial strategy white paper (Industrial Strategy: building a Britain fit for the future, Department for Business, Energy and Industrial Strategy, November 2017): "Technologies with the ability to perform tasks that would otherwise require human intelligence." This definition includes everything from the simplest application of AI, which relies on making decisions based on static, predefined rules and parameter checking (e.g. if x, then do y), as well as simple 'yes' or 'no' decision trees, right through to the dynamic, complex 'learning (or evolving)' algorithms which continuously review incoming data, find patterns and adapt existing algorithms (and so, 'learn'). The rise of AI is sometimes represented as a threatening new development, but many now-mundane functions were once seen as awe-inspiring new AI developments! Everyday examples of functionality previously, but no longer, considered as AI include voice-to-text transcription (now available on any smartphone), the digitising of scanned documents through handwriting recognition or optical character recognition, and spam filtering. Like it or not, health tech is being embraced at the highest NHS and government levels. Speaking at a digital health conference this summer, Unlocking the promise of digital health, Simon Stevens, chief executive of NHS England, pledged to consider reimbursement reforms to the NHS tariff and other payment systems to incentivise the uptake of AI technologies across the health system.


#RBTV What is OCR and Why Should Business Owners Care?

#artificialintelligence

It's time for business owners to leverage technology to take care of routine back-office tasks. How exactly do they do that? Through OCR (optical character recognition) and machine learning. Small business technology expert and CPA Gene Marks explains what this means for small business owners doing manual activities like bookkeeping and accounting. The question now is, what will you do with your spare time? Check out OCR and machine learning by starting a free trial over at www.receipt-bank.com/us/for-business/


The Gaming AI Tool That Can Translate Japanese On The Fly -- AI Daily - Artificial Intelligence News

#artificialintelligence

Many gamers around the world love to play classic games such as Elder Scrolls, Super Mario and Metal Gear, but older games, like those made in the 1990s, can lack localisations for each region. In addition, some games are released exclusively in specific regions, so Japanese exclusives are only available in Japanese. An example of this is Mother 3, which was released only in Japan after the highly acclaimed Mother 2 (EarthBound), which had a worldwide release. This meant that fans had to translate the Japanese text if they wanted to know what was going on. RetroArch is a popular open-source gaming emulator where you can play classic games from consoles like the GameCube on your PC.


Google Photos now lets you search for text in pictures you've taken

#artificialintelligence

Google made a subtle announcement today on Twitter: it's in the process of rolling out new AI features for its Lens platform that will let you search your Google Photos library for text that appears within photos and screenshots. You'll then be able to easily copy and paste that text into a note, document, or form. Both of the new features make use of a technique known as optical character recognition (OCR), with the copy/paste option building on Lens' existing ability to understand and pull out the text found within photos, be it a screenshot or a photo of a physical sign or document. According to 9to5Google, that feature is available now on some Android devices, although it does not appear to be active quite yet on iOS. You may already be able to search your photos for text using Google Photos on the web.


Text to speech Python Tutorial

#artificialintelligence

We can make the computer speak with Python. Given a text string, it will speak the written words in the English language. This process is called Text To Speech (TTS). pyttsx is a cross-platform text-to-speech wrapper. gTTS is another option; it uses the Google Text to Speech (TTS) API.
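As a rough illustration of the idea, here is a minimal sketch of speaking a string from Python. It assumes the third-party pyttsx3 package (a maintained fork of the wrapper mentioned above, not named in the excerpt) is installed; the `speak` helper and its `engine` parameter are hypothetical names introduced only for this example.

```python
def speak(text, engine=None):
    """Speak `text` aloud and return the engine that was used.

    `engine` is a hypothetical hook: pass any object with `say` and
    `runAndWait` methods, or leave it as None to use pyttsx3.
    """
    if engine is None:
        # Assumption: pyttsx3 is installed (pip install pyttsx3).
        # The import is lazy so this module loads without it.
        import pyttsx3
        engine = pyttsx3.init()
    engine.say(text)       # queue the utterance
    engine.runAndWait()    # block until the queued speech has been played
    return engine
```

Calling `speak("Hello, world")` on a machine with a working speech engine should read the text aloud; the voice and language depend on the engines installed on the operating system.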


Amazon's Text-To-Speech AI Service Sounds More Natural And Realistic

#artificialintelligence

Amazon enhanced Polly, its cloud-based text-to-speech service, to deliver natural and realistic speech synthesis. The service can now be used to present domain-specific styles such as newscast and sportscast. Though text-to-speech has existed for more than two decades, it has never been used in mainstream media due to the lack of natural and realistic modulation. Except for automated announcements that read out from existing datastores, the technology never replaced human voice and speech. Thanks to advancements in AI, text-to-speech has evolved to become natural and realistic to the extent that it may be hard to distinguish from a human voice.


Japan Post could end Saturday standard mail deliveries next year after ministry moves to stop service

The Japan Times

A government panel decided Tuesday to end Saturday delivery for standard mail to deal with a labor shortage at Japan Post Co. and a drop in demand due to increased use of the internet. The Internal Affairs and Communications Ministry accepted the proposal from the panel and will seek a law amendment at an extraordinary Diet session this fall. Saturday delivery could end as early as next year, leaving delivery available only on weekdays. The panel also proposed that delivery for standard mail the day after posting be ended. Japan Post, a unit of Japan Post Holdings Co., has been calling for a review to trim standard mail service to five days a week from the current six to address the workforce shortage.


RNN-based Online Handwritten Character Recognition Using Accelerometer and Gyroscope Data

arXiv.org Machine Learning

This abstract explores an RNN-based approach to the online handwriting recognition problem. Our method uses data from an accelerometer and a gyroscope mounted on a handheld pen-like device to train and run a character prediction model. We have built a dataset of timestamped gyroscope and accelerometer data gathered during the manual process of handwriting Latin characters, labeled with the character being written; in total, the dataset consists of 1500 gyroscope and accelerometer data sequences for 8 characters of the Latin alphabet from 6 different people, and 1500 samples each for 20 characters of the Georgian alphabet from 5 different people, with each sequence containing the gyroscope and accelerometer data captured during the writing of a particular character, sampled once every 10 ms. We train an RNN-based neural network architecture on this dataset to predict the character being written. The model is optimized with categorical cross-entropy loss and the RMSprop optimizer and achieves high accuracy on test data.
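To make the optimization setup concrete, here is a minimal pure-Python sketch of categorical cross-entropy loss and an RMSprop update. It is not the authors' model: there is no RNN and no sensor data, and the toy 8-class problem, the hyperparameter values, and all function names are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Loss for one example: negative log-probability of the true class."""
    return -math.log(probs[true_index])


def rmsprop_step(params, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """One RMSprop update: scale each gradient by a running RMS of its history."""
    new_params, new_cache = [], []
    for p, g, c in zip(params, grads, cache):
        c = decay * c + (1 - decay) * g * g           # running mean of g^2
        new_params.append(p - lr * g / (math.sqrt(c) + eps))
        new_cache.append(c)
    return new_params, new_cache


# Toy problem (assumption): 8 classes, and the logits themselves are the
# trainable parameters. The gradient of the loss w.r.t. the logits is
# softmax(logits) minus the one-hot target.
logits = [0.0] * 8
cache = [0.0] * 8
target = 3
initial_loss = categorical_cross_entropy(softmax(logits), target)

for _ in range(200):
    probs = softmax(logits)
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    logits, cache = rmsprop_step(logits, grads, cache, lr=0.05)

final_loss = categorical_cross_entropy(softmax(logits), target)
```

In a real model the same loss and update rule would be applied to the network weights, with gradients obtained by backpropagation through time rather than the closed-form expression used here.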


Rosetta: Understanding text in images and videos with machine learning - Facebook Code

#artificialintelligence

Understanding the text that appears on images is important for improving experiences, such as a more relevant photo search or the incorporation of text into screen readers that make Facebook more accessible for the visually impaired. Understanding text in images along with the context in which it appears also helps our systems proactively identify inappropriate or harmful content and keep our community safe. A significant number of the photos shared on Facebook and Instagram contain text in various forms. It might be overlaid on an image in a meme, or inlaid in a photo of a storefront, street sign, or restaurant menu. Taking into account the sheer volume of photos shared each day on Facebook and Instagram, the number of languages supported on our global platform, and the variations of the text, the problem of understanding text in images is quite different from those solved by traditional optical character recognition (OCR) systems, which recognize the characters but don't understand the context of the associated image.