rOpenSci Support for hOCR and Tesseract 4 in R
Earlier this month we released a new version of the tesseract package to CRAN. This package provides R bindings to Tesseract, Google's open-source optical character recognition (OCR) engine. The two major new features are support for hOCR output and support for the upcoming Tesseract 4. hOCR support was requested by one of our users on GitHub. Every word in the hOCR output carries metadata such as its bounding box and confidence metrics, so it gives us more information about the OCR results than the plain text alone.
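To make the per-word metadata concrete, here is a minimal, language-agnostic sketch in Python (standard library only) that pulls the bounding box and confidence out of an hOCR fragment. The fragment is hand-written for illustration and is not actual Tesseract output. It follows the hOCR convention of storing `bbox` and `x_wconf` in each word's `title` attribute, and the helper name `parse_words` is hypothetical. Real hOCR output is a full XHTML document, usually with an XML namespace, so the tag matching below would need adjusting for it.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written hOCR fragment for illustration only (not real Tesseract output).
HOCR = """<div class='ocr_page' title='bbox 0 0 640 480'>
 <span class='ocr_line' title='bbox 36 92 580 122'>
  <span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 89'>This</span>
  <span class='ocrx_word' id='word_1_2' title='bbox 109 92 129 116; x_wconf 93'>is</span>
 </span>
</div>"""


def parse_words(hocr):
    """Return one dict per word element with its text, bbox, and confidence."""
    root = ET.fromstring(hocr)
    words = []
    for el in root.iter("span"):
        # Skip line-level spans; only word-level elements are of interest here.
        if el.get("class") != "ocrx_word":
            continue
        title = el.get("title", "")
        bbox = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", title)
        conf = re.search(r"x_wconf (\d+(?:\.\d+)?)", title)
        words.append({
            "word": el.text or "",
            # bbox is (left, top, right, bottom) in pixels
            "bbox": tuple(int(v) for v in bbox.groups()) if bbox else None,
            "confidence": float(conf.group(1)) if conf else None,
        })
    return words


words = parse_words(HOCR)
for w in words:
    print(w)
```

Each resulting record pairs a recognized word with its position on the page and the engine's confidence in it. That is the extra information hOCR provides over plain text output.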
Feb-13-2018, 23:47:10 GMT