Parallel HOCR

Aug 9, 2016 at 3:38 AM
I have another project that uses the Tesseract.Net wrapper to provide OCR. It lets me process a group of pages in parallel (Parallel.ForEach).

I would also like to do the same for a new project using

Is it possible to supply the HOCR to your solution directly? Instead of calling doc.OCR(...), I would set doc.HOCR = "hocrValue" (or make a similar call).

This would allow me to run all OCR operations in parallel, then create the PDFs once all the HOCR is complete (stored in a SQLite table).
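
To make the workflow concrete, here is a rough sketch of the two phases I have in mind. Everything in it is a placeholder: RunOcr is a hypothetical stand-in for the real OCR call, the ConcurrentDictionary stands in for the SQLite table, and the doc.HOCR assignment in the comment is the feature being requested, not something that exists today as far as I know.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class ParallelHocrSketch
{
    // Hypothetical stand-in for the real OCR call; returns dummy hOCR markup.
    static string RunOcr(string imagePath) =>
        $"<div class='ocr_page' title='{imagePath}'></div>";

    static void Main()
    {
        string[] pages = { "page1.tif", "page2.tif", "page3.tif" };

        // Stand-in for the SQLite table: page index -> hOCR markup.
        var hocrByPage = new ConcurrentDictionary<int, string>();

        // Phase 1: OCR every page in parallel and store the result by page index.
        Parallel.ForEach(Enumerable.Range(0, pages.Length), i =>
        {
            hocrByPage[i] = RunOcr(pages[i]);
        });

        // Phase 2: only after all OCR has finished, walk the pages in order.
        // This is where the requested assignment (doc.HOCR = ...) would go.
        foreach (var entry in hocrByPage.OrderBy(e => e.Key))
        {
            Console.WriteLine($"{pages[entry.Key]}: {entry.Value.Length} chars of hOCR");
        }
    }
}
```

Keying the results by page index means the order in which the parallel tasks finish does not matter; the PDF step can still read the pages back in their original order.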