Missing hocr file for tesseract call?

May 23, 2011 at 2:06 AM

To get this library working, I had to create an "hocr" configuration file for the Tesseract call.

It was throwing a "file not found" exception in the ParseHOCR method - basically, the *.hocr.html files weren't being created by Tesseract.

The fix seemed to be creating a text file named "hocr" (no extension) in the \bin\Debug directory of my project.  The file contains a single line:

tessedit_create_hocr 1

I'm not sure that this is the same config file the original developer was using, but it seems to be working-ish.
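
For anyone who would rather script this step than create the file by hand, here is a minimal Python sketch of the workaround described above. The function name `write_hocr_config` and the `bin/Debug` path in the example are just illustrative assumptions, not part of the library; point it at whichever directory your build actually runs from.

```python
from pathlib import Path


def write_hocr_config(target_dir):
    """Create a text file named 'hocr' (no extension) in target_dir.

    The file holds the single config line from the post above.
    target_dir is an assumption: use your project's own output directory.
    """
    config_path = Path(target_dir) / "hocr"
    # Make sure the output directory exists before writing into it.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text("tessedit_create_hocr 1\n", encoding="ascii")
    return config_path


if __name__ == "__main__":
    # Hypothetical location matching the \bin\Debug directory mentioned above.
    print(write_hocr_config(Path("bin") / "Debug"))
```

This only writes the file; whether Tesseract picks it up from that directory depends on how the library invokes it, so treat the location as something to verify on your own setup.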

May 28, 2011 at 5:39 AM

It is. For some reason the Tesseract download from its homepage doesn't install this file, even though the feature exists. You solved it. I should update the downloads and instructions to include this file and say where to copy it.