OCR

Using gImageReader for OCR

In a previous post I walked through writing a shell script for optical character recognition (OCR) from the terminal. That post was educational, since one had to learn some bash scripting along the way, but I have to admit that the resulting script was not optimal, especially for a production environment.

Building a Simple OCR Application with Tesseract

The bane of any blind person doing text processing is coming across inaccessible content. This post looks at how to work with digitised images and scanned PDFs from the terminal. Along the way, we will develop a rudimentary program to convert PDF files into plain text.
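
To give a rough idea of what such a program might look like, here is a minimal sketch. It is not the script from the post itself. It assumes that pdftoppm (from poppler-utils) and tesseract are installed, and the function names txt_name_for and pdf_to_text are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes pdftoppm (poppler-utils) and tesseract are on PATH.
# The function names below are hypothetical, chosen for illustration.

# Derive the output text file name from a PDF path: /a/b/scan.pdf -> scan.txt
txt_name_for() {
    local base
    base=${1##*/}                    # strip any leading directories
    printf '%s.txt\n' "${base%.*}"   # swap the last extension for .txt
}

# Render each page of a PDF as an image, OCR it, and concatenate the text.
pdf_to_text() {
    local pdf out tmp page
    pdf=$1
    out=${2:-$(txt_name_for "$pdf")}
    tmp=$(mktemp -d) || return 1

    # Render the pages as 300 dpi PNG files named page-<n>.png
    pdftoppm -r 300 -png "$pdf" "$tmp/page" || { rm -rf "$tmp"; return 1; }

    : > "$out"   # start with an empty output file
    # The glob expands in lexical order; this assumes pdftoppm zero-pads the
    # page numbers (recent poppler versions do), so pages stay in order.
    for page in "$tmp"/page-*.png; do
        # "stdout" tells tesseract to print the recognised text instead of
        # writing a file; its progress messages on stderr are discarded.
        tesseract "$page" stdout >> "$out" 2>/dev/null
    done
    rm -rf "$tmp"
}
```

After sourcing the file, running `pdf_to_text scan.pdf` should leave the recognised text in scan.txt in the current directory. A second argument, if given, overrides the output file name.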