Parallel HOCR

Aug 9, 2016 at 2:38 AM
I have another project that uses the Tesseract.Net wrapper for OCR. It lets me process a group of pages in parallel (Parallel.ForEach).

I would also like to do the same for a new project using hocr2pdf.net.

Is it possible to supply the HOCR directly to your solution? That is, instead of calling doc.OCR(...), set doc.HOCR = "hocrValue" (or make a similar call).

This would allow me to run all OCR operations in parallel, then create the PDFs once all the HOCR is complete (stored in a SQLite table).
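
To make the two-phase flow concrete, here is a minimal sketch of what I have in mind. It is written in Python only to keep it short and self-contained; the real code would use Parallel.ForEach. Everything here is an assumption for illustration: fake_ocr_to_hocr is a hypothetical stand-in for the Tesseract call, the table layout is made up, and nothing in it is part of hocr2pdf.net.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def fake_ocr_to_hocr(page_name):
    # Hypothetical stand-in for the real OCR call; it only returns a tiny
    # hOCR-like string so that the sketch runs without an OCR engine.
    return f"<div class='ocr_page' title='image {page_name}'></div>"


def ocr_pages_in_parallel(page_names, db_path=":memory:"):
    # Phase 1: run OCR on all pages concurrently. pool.map returns results
    # in the same order as the input, so pages and HOCR values stay aligned.
    with ThreadPoolExecutor(max_workers=4) as pool:
        hocr_values = list(pool.map(fake_ocr_to_hocr, page_names))

    # Store the results from a single thread, one row per page.
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hocr (page TEXT PRIMARY KEY, content TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO hocr (page, content) VALUES (?, ?)",
        zip(page_names, hocr_values),
    )
    conn.commit()
    return conn


def load_hocr_in_order(conn):
    # Phase 2 input: read the stored HOCR back, ready to be handed to
    # whatever call ends up accepting precomputed HOCR.
    return conn.execute(
        "SELECT page, content FROM hocr ORDER BY page"
    ).fetchall()


conn = ocr_pages_in_parallel(["page1.png", "page2.png", "page3.png"])
rows = load_hocr_in_order(conn)
```

Each row in rows would then be passed to the PDF step in place of the doc.OCR(...) call, if such an entry point exists or can be added.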