Collaborating Authors

Optical Character Recognition



Retrieving information from documents and forms has long been a challenge, and even now, at the time of writing, organisations are still handling significant amounts of paper forms that need to be scanned, classified and mined for specific information to enable downstream automation and efficiencies. Automating this extraction and applying intelligence is in fact a fundamental step toward digital transformation that organisations are still struggling to take in an efficient and scalable manner. An example could be a bank that receives hundreds of kilograms of very diverse remittance forms a day, which people must process manually in order to extract a few key fields. Or medical prescriptions may need to be processed automatically to extract the prescribed medication and quantity. Typically, organisations will have built text mining and search solutions that are tailored to a single scenario, with baked-in application logic, resulting in a brittle solution that is difficult and expensive to maintain.

Artificial Intelligence and Machine Learning – Path to Intelligent Automation


With evolving technologies, intelligent automation has become a top priority for many executives in 2020. Forrester predicts the industry will continue to grow from $250 million in 2016 to $12 billion in 2023. With more companies identifying and implementing Artificial Intelligence (AI) and Machine Learning (ML), a gradual reshaping of the enterprise is under way. Industries across the globe integrate AI and ML into their businesses to enable swift changes to key processes like marketing, customer relationships and management, product development, production and distribution, quality checks, order fulfilment, resource management, and much more. AI includes a wide range of technologies such as machine learning, deep learning (DL), optical character recognition (OCR), natural language processing (NLP), voice recognition, and so on, which, when combined with robotics, create intelligent automation for organizations across multiple industrial domains.

Become a speed reading machine with this online class


TL;DR: The Become a Speed Reading Machine course is on sale for £19.14 as of August 5, saving you 87% on list price. If you're being honest, you've probably always been secretly -- and irrationally -- jealous of speedy readers. Back in school, there were always a few classmates who zoomed through a dense chapter and got to start lunch early. The rest of us were stuck decoding a confusing, run-on sentence while our milk got warm. Now those kids are colleagues who answer emails quicker, read more news, and are arguably more productive throughout the day.

Computer Vision: Python OCR & Object Detection Quick Starter


This is the third course from my Computer Vision series. Image Recognition, Object Detection, Object Recognition and also Optical Character Recognition are among the most used applications of Computer Vision. Using these techniques, the computer will be able to recognize and classify either the whole image or multiple objects inside a single image, predicting the class of each object with a percentage accuracy score. Using OCR, it can also recognize text in images and convert it to a machine-readable format such as plain text or a document. Object Detection and Object Recognition are widely used in many simple applications and also in complex ones like self-driving cars.

The Building Blocks of Artificial Intelligence


Machine vision is the classification and tracking of real-world objects based on visual, x-ray, laser, or other signals. Optical character recognition was an early success of machine vision, but deciphering handwritten text remains a work in progress. The quality of machine vision depends on human labeling of a large quantity of reference images. The simplest way for machines to start learning is through access to this labeled data. Within the next five years, video-based computer vision will be able to recognize actions and predict motion--for example, in surveillance systems.

Extracting custom entities from documents with Amazon Textract and Amazon Comprehend


Amazon Textract is a machine learning (ML) service that makes it easy to extract text and data from scanned documents. Textract goes beyond simple optical character recognition (OCR) to identify the contents of fields in forms and information stored in tables. This allows you to use Amazon Textract to instantly "read" virtually any type of document and accurately extract text and data without needing any manual effort or custom code. Amazon Textract has multiple applications in a variety of fields. For example, talent management companies can use Amazon Textract to automate the process of extracting a candidate's skill set.
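
As a rough illustration of what "identifying the contents of fields in forms" can look like in practice, the sketch below pairs form keys with their values from a Textract-style response. It assumes the response is a dictionary shaped like the output of the AnalyzeDocument operation with the FORMS feature enabled (a "Blocks" list containing KEY_VALUE_SET and WORD blocks linked by "Relationships"). The helper names `form_fields` and `_text_of` are hypothetical and are not part of any AWS SDK.

```python
from typing import Dict, List


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into a single string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(response: dict) -> Dict[str, str]:
    """Map each detected form key to the text of its linked value block(s).

    Assumption: `response` follows the AnalyzeDocument (FORMS) block layout.
    """
    by_id = {block["Id"]: block for block in response["Blocks"]}
    fields: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # A key block points at its value block(s) by id.
                value_text = " ".join(
                    _text_of(by_id[value_id], by_id) for value_id in rel["Ids"]
                )
        fields[_text_of(block, by_id)] = value_text
    return fields
```

Keeping the parsing separate from the service call means the function can be exercised against a saved response without network access or credentials.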

Cognitive AI and the Power of Intelligent Data Digitalization


In a quest to decode what keeps the world moving, enterprises across the world are baffled. It is not precious metals or even cryptocurrency – it is data. The adage that data is the new oil holds true, and soon every company in the world will either buy or sell data, with the value of this corporate asset gaining prominence with each passing day. Data fuels the digital transformation that is driving a mammoth disruption across all industries. It is the key differentiator, arriving at massive speed and characterised by volume, variety, velocity and veracity in a very live environment.

6 cognitive automation use cases in the enterprise


Cognitive automation is an extension of existing robotic process automation (RPA) technology. Machine learning enables bots to remember the best ways of completing tasks, while technology like optical character recognition increases the data formats with which bots can interact. Cognitive automation adds a layer of AI to RPA software to enhance the ability of RPA bots to complete tasks that require more knowledge and reasoning. These tasks can range from answering complex customer queries to extracting pertinent information from document scans. Some examples of mature cognitive automation use cases include intelligent document processing and intelligent virtual agents. In contrast, Modi sees intelligent automation as the automation of more rote tasks and processes by combining RPA and AI.

Object Detection on Newspaper images using YoloV3


I was trying my hand at Optical Character Recognition on newspaper images when I realised that most documents have sections, and text does not necessarily run across the entire horizontal space of the page. Even though Tesseract was able to recognise the text, it was jumbled up. To fix this, the model should be able to identify the sections of the document, draw a bounding box around each one, and then perform OCR. It was at this moment that applying YOLO object detection to such images came to mind. YOLOv3 is extremely fast and accurate.
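
The idea of detecting sections first and running OCR per section can be sketched roughly as follows. This is not the author's code: it assumes the detector has already returned one `(x, y, width, height)` box per section, and it groups boxes into columns by their left edge before reading each column top to bottom. The names `reading_order`, `ocr_sections`, `column_tolerance` and the `ocr_region` callback (which would crop the image and call an OCR engine such as Tesseract) are hypothetical.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def reading_order(boxes: List[Box], column_tolerance: int = 50) -> List[Box]:
    """Order section boxes column by column, top to bottom within a column.

    Boxes whose left edge lies within `column_tolerance` pixels of the first
    box in the current column are treated as belonging to that column.
    """
    columns: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[0]):
        if columns and abs(box[0] - columns[-1][0][0]) <= column_tolerance:
            columns[-1].append(box)
        else:
            columns.append([box])
    return [
        box
        for column in columns
        for box in sorted(column, key=lambda b: b[1])
    ]


def ocr_sections(boxes: List[Box], ocr_region: Callable[[Box], str]) -> str:
    """Run OCR on each section separately and join the text in reading order."""
    return "\n\n".join(ocr_region(box) for box in reading_order(boxes))
```

Because each section is recognised on its own, text from neighbouring columns is no longer interleaved line by line, which is the jumbling described above. The left-edge heuristic is only a starting point and would need tuning for layouts with headlines that span several columns.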



Artificial Intelligence (AI) -- and its attendant term, 'Machine Learning' (ML) -- is described as the capability of a computer system to perform tasks that normally require human intelligence, such as visual perception, speech recognition and decision-making. Almost all AI/ML examples in commercial as well as military use today rely on data stores that drive deep learning and natural language processing.[1] The defining feature of an AI/ML system is its ability to learn and solve problems. There has been a gradual change in our understanding of what exactly constitutes AI. While advancements in computer hardware and more efficient software have led to the development of AI systems, tasks that were hitherto computer-resource-intensive, such as optical character recognition (OCR), are now considered routine technology and, hence, are no longer included in any contemporary discussion of AI/ML.