A Novel Pipeline for Improving Optical Character Recognition through Post-processing Using Natural Language Processing