Optical Character Recognition


Ray Kurzweil (USA) at Ci2019 - The Future of Intelligence, Artificial and Natural

#artificialintelligence

Called "the restless genius" by The Wall Street Journal and "the ultimate thinking machine" by Forbes magazine, he was selected as one of the top entrepreneurs by Inc. magazine, which described him as the "rightful heir to Thomas Edison." PBS selected him as one of the "sixteen revolutionaries who made America." Ray was the principal inventor of the first CCD flat-bed scanner, the first omni-font optical character recognition, the first print-to-speech reading machine for the blind, the first text-to-speech synthesizer, the first music synthesizer capable of recreating the grand piano and other orchestral instruments, and the first commercially marketed large-vocabulary speech recognition. Among Ray's many honors, he received a Grammy Award for outstanding achievements in music technology, is the recipient of the National Medal of Technology, was inducted into the National Inventors Hall of Fame, holds twenty-one honorary doctorates, and has received honors from three U.S. presidents. Ray has written five national best-selling books, including New York Times best sellers The Singularity Is Near (2005) and How to Create a Mind (2012). He is Co-Founder and Chancellor of Singularity University and a Director of Engineering at Google, heading up a team developing machine intelligence and natural language understanding.


Medline streamlines workflow by automating accounts payable

#artificialintelligence

Medline Industries, a manufacturer and distributor of medical supplies, based in Northfield, Ill., is growing quickly, said Sarah Stokes, director of accounts payable at the company. That means double-digit sales growth year over year, she said, but it also means more paperwork. "Our biggest challenge is trying to keep up with the volume," Stokes said. To help tackle the 2,000 invoices the company receives each day, Medline has turned to optical character recognition (OCR) technology from vendor Abbyy, paired with platforms from several RPA vendors. The combination has gone a long way in automating accounts payable, Stokes said.


BlinkID

#artificialintelligence

MRZ is an abbreviation for Machine Readable Zone, whereas MRTD refers to Machine Readable Travel Document. MRZ is a format found on most passports and identity documents worldwide. It contains the document holder's data in a form that is both visually readable and formatted for optical character recognition. Built on the latest advances in machine learning, BlinkID enables scanning of MRZ with unsurpassed accuracy and speed.
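
As a rough illustration of why MRZ data suits machine reading, the fields in an MRZ carry check digits that let software catch OCR misreads. The sketch below follows the check-digit scheme commonly described for ICAO machine readable travel documents (repeating weights 7, 3, 1; digits keep their value, A to Z map to 10 to 35, and the filler `<` counts as 0). The function names are hypothetical and are not part of BlinkID or any vendor SDK.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value (illustrative helper)."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"character not allowed in an MRZ: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Compute the check digit for one MRZ field using weights 7, 3, 1."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field))
    return total % 10


def field_is_consistent(field: str, printed_digit: str) -> bool:
    """Return True when the printed check digit matches the computed one."""
    return printed_digit.isdigit() and mrz_check_digit(field) == int(printed_digit)


if __name__ == "__main__":
    # Sample field values chosen for illustration only.
    print(mrz_check_digit("740812"))             # date-style field
    print(field_is_consistent("L898902C3", "6"))  # document-number-style field
```

A scanner could run a check like `field_is_consistent` on each recognized field and re-scan, or flag the document for review, when a check digit does not match.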


Jumio Announces Gains in Speed, Accuracy, User Experience

#artificialintelligence

Jumio, the AI-powered trusted identity as a service provider, announced gains in the speed and accuracy of its verification services, as well as a more intuitive user experience. These gains come after a two-year period during which Jumio invested heavily in automation enabled through a variety of machine learning, artificial intelligence, and optical character recognition (OCR) technologies. As a result of its investment in supervised machine learning models, Jumio was also able to recently launch Jumio Go, its new identity verification solution powered exclusively by AI. The investment has also improved Jumio's current suite of identity verification and authentication services. "We're seeing across-the-board improvements in our ability to automate virtually every phase of the identity verification process, making our core solutions even faster, easier and more accurate for our customers and their end users," said Labhesh Patel, Jumio CTO and chief scientist.


DeepMind Uses GANs to Convert Text to Speech

#artificialintelligence

Generative Adversarial Networks (GANs) have revolutionized high-fidelity image generation, making global headlines with their hyperrealistic portraits and content-swapping, while also raising concerns with convincing deepfake videos. Now, DeepMind researchers are expanding GANs to audio, with a new adversarial network approach for high fidelity speech synthesis. Text-to-Speech (TTS) is a process for converting text into a humanlike voice output. One of the most commonly used TTS network architectures is WaveNet, a neural autoregressive model for generating raw audio waveforms. But because WaveNet relies on the sequential generation of one audio sample at a time, it is poorly suited to today's massively parallel computers.
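
To make the sequential bottleneck concrete, here is a toy sketch of autoregressive generation. It is not WaveNet: a fixed linear combination of recent samples stands in for the neural network, and the function name and coefficients are assumptions made up for illustration. The point is only that each new sample needs the samples before it, so the loop cannot be spread across parallel hardware.

```python
from typing import List, Sequence


def generate_autoregressive(
    n_samples: int, coeffs: Sequence[float], seed: Sequence[float]
) -> List[float]:
    """Generate samples one at a time; each depends on the previous len(coeffs) samples."""
    if len(seed) < len(coeffs):
        raise ValueError("seed must provide at least len(coeffs) samples")
    history = list(seed)
    for _ in range(n_samples):
        context = history[-len(coeffs):]  # most recent samples, oldest first
        next_sample = sum(c * s for c, s in zip(coeffs, context))
        history.append(next_sample)       # must finish before the next step can start
    return history[len(seed):]


if __name__ == "__main__":
    # With coefficients (1, 1) and seed (1, 1), the toy model adds the last two samples.
    print(generate_autoregressive(5, coeffs=(1, 1), seed=(1, 1)))
```

One second of audio at a sample rate of 24,000 samples per second would need 24,000 such dependent steps, which is why the passage describes this style of model as a poor fit for massively parallel hardware.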


Who Uses Text to Speech (TTS) Anyway? - ReadSpeaker

#artificialintelligence

First things first: what is TTS? TTS, or Text-to-Speech technology, converts text into spoken speech. If you know Siri or those handy voice GPS directions on smartphones, then congratulations! Since 1000 AD, humans have strived to create synthetic speech, but it didn't enter the mainstream until the mid-1970s to early 1980s, when computer operating systems began implementing it. Walt Tetschner, leader of the group that produced DECtalk in 1983, explains that while the voice wasn't perfect, it was still natural-sounding and was used by companies such as MCI and Mtel (two-way paging).


Learn about the benefits of text to speech

#artificialintelligence

Every end user is a customer, and the quality of the customer journey is everything, regardless of whether the objective is purchasing a product or service or engaging in content fruition. End users can be website visitors, application, device, service, and machine users, online learners or teachers, and more. Text to speech allows content owners to respond to the different needs and desires of each user in terms of how they interact with the content.


KuroNet: Pre-Modern Japanese Kuzushiji Character Recognition with Deep Learning

#artificialintelligence

Kuzushiji, a cursive writing style, was used in Japan for over a thousand years, starting in the 8th century. Over 3 million books on a diverse array of topics, such as literature, science, mathematics and even cooking, are preserved. However, following a change to the Japanese writing system in 1900, Kuzushiji has not been included in regular school curricula. Therefore, most Japanese natives nowadays cannot read books written or printed just 150 years ago. Museums and libraries have invested a great deal of effort into creating digital copies of these historical documents as a safeguard against fires, earthquakes and tsunamis.


News Details - Deloitte US uses AI to transform indirect tax recovery

#artificialintelligence

Deloitte US has deployed a technology-enabled solution, CognitiveTax Insight (CogTax), to provide a more efficient analysis of clients' indirect tax data sets. CogTax can analyse the full population of a client's accounts payable transactions, if desired, rather than relying on a traditional, sampled approach. The solution can help companies to proactively avoid overpaying their indirect tax liabilities. CogTax leverages optical character recognition (OCR), along with advanced machine learning algorithms and analytics, to analyse a full population of data and documents to assist clients with indirect tax overpayment recovery and reduce the potential for future over- or underpayments. "CogTax moves tax analysis from a manual, administrative process to an automated, machine learning process that brings with it accuracy and scale previously not achievable," said Deval Reddy, indirect tax principal in the multistate tax services practice, Deloitte Tax LLP.