Improving OCR Results with Basic Image Processing - PyImageSearch


In our previous tutorial, you learned how to improve the accuracy of Tesseract OCR by supplying the appropriate page segmentation mode (PSM). The PSM lets you select a segmentation method suited to your particular image and the environment in which it was captured. However, there are times when changing the PSM is not sufficient, and you instead need a bit of computer vision and image processing to clean up the image before passing it through the Tesseract OCR engine. To learn how to improve OCR results using basic image processing, just keep reading.

Exactly which image processing algorithms or techniques you use depends heavily on your situation, project requirements, and input images. Even so, it's important to gain experience applying image processing to clean up images before OCR'ing them.
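To make the "clean up first, then OCR" idea concrete, here is a minimal sketch, assuming OpenCV (`cv2`) and the `pytesseract` wrapper are installed. The function names `tesseract_config` and `ocr_with_cleanup` are hypothetical helpers invented for this illustration, and Otsu thresholding is just one possible cleanup step, not necessarily the one your images need.

```python
def tesseract_config(psm: int = 6) -> str:
    """Build the Tesseract option string that selects a page segmentation mode.

    Hypothetical helper for this sketch; Tesseract's PSM values run from 0 to 13.
    """
    if not 0 <= psm <= 13:
        raise ValueError("Tesseract PSM values range from 0 to 13")
    return f"--psm {psm}"


def ocr_with_cleanup(image_path: str, psm: int = 6) -> str:
    """Binarize an image with Otsu's method, then OCR the cleaned result.

    Hypothetical helper for this sketch; swap the thresholding step for
    whatever cleanup suits your own images.
    """
    # Third-party imports are kept local so tesseract_config() can be used
    # without OpenCV or pytesseract installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    # Convert to grayscale, since thresholding operates on a single channel.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Otsu's method chooses the threshold automatically; the 0 passed here is ignored.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

    # Pass the cleaned image to Tesseract with the chosen PSM.
    return pytesseract.image_to_string(binary, config=tesseract_config(psm))
```

Calling `ocr_with_cleanup("receipt.png", psm=6)` (the file name is a placeholder) would return the recognized text as a string. Keeping the PSM as a parameter lets you experiment with segmentation and image cleanup independently.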