OCR a document, form, or invoice with Tesseract, OpenCV, and Python - PyImageSearch
In this tutorial, you will learn how to OCR a document, form, or invoice using Tesseract, OpenCV, and Python. In the example figure, the left panel shows our template image (i.e., a form from the United States Internal Revenue Service). The middle panel is the input image that we wish to align to the template, which allows us to match fields between the two images. Finally, the right panel shows the output of aligning the two images. At this point, we can associate each text field in the input form with the corresponding field in the template, meaning that we know which locations of the input image map to the name, address, EIN, etc. fields of the template. Knowing where and what the fields are allows us to OCR each field individually and keep track of the results for further processing, such as automated database entry.
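The field-by-field OCR step described above might look roughly like the following. This is a minimal sketch, not the tutorial's actual code: it assumes the input image has already been aligned to the template, and the `OCRLocation` tuple, the bounding-box coordinates, the filter keywords, and the helper names `clean_field_text` and `ocr_fields` are all hypothetical, chosen only for illustration. It also assumes the `opencv-python` and `pytesseract` packages and a Tesseract binary are installed.

```python
from collections import namedtuple

# Hypothetical field layout: each bbox is (x, y, w, h) in template pixel
# coordinates. Real coordinates would be measured on the actual template.
OCRLocation = namedtuple("OCRLocation", ["id", "bbox", "filter_keywords"])

OCR_LOCATIONS = [
    OCRLocation("name", (27, 96, 340, 32), ["name", "first"]),
    OCRLocation("address", (27, 160, 340, 32), ["address"]),
    OCRLocation("ein", (400, 96, 200, 32), ["employer", "identification"]),
]


def clean_field_text(raw, filter_keywords):
    """Drop blank lines and lines that contain the field's printed label."""
    kept = []
    for line in raw.split("\n"):
        stripped = line.strip()
        if not stripped:
            continue
        lowered = stripped.lower()
        # Skip the label text printed on the form itself (e.g. "First name").
        if any(keyword in lowered for keyword in filter_keywords):
            continue
        kept.append(stripped)
    return "\n".join(kept)


def ocr_fields(aligned, locations):
    """OCR each field region of an image already aligned to the template."""
    # Imported lazily so the helpers above work without OpenCV or Tesseract.
    import cv2
    import pytesseract

    results = {}
    for loc in locations:
        x, y, w, h = loc.bbox
        # Crop the field from the aligned image (a NumPy array in BGR order).
        roi = aligned[y:y + h, x:x + w]
        # Tesseract expects RGB channel ordering, while OpenCV loads BGR.
        rgb = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB)
        raw = pytesseract.image_to_string(rgb)
        results[loc.id] = clean_field_text(raw, loc.filter_keywords)
    return results
```

Calling `ocr_fields(aligned, OCR_LOCATIONS)` on an aligned image would return a dictionary keyed by field id, which is a convenient shape for later steps such as database entry. Note that the keyword filter is deliberately naive: a field value that happens to contain a label word would also be dropped, so a production version would likely need a stricter rule.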
Sep-14-2020, 14:37:16 GMT