nlp-private:ocr-engine-pros-and-cons [2015/04/23 19:38] (current)
ryancha created
Line 1: Line 1:
 +* Cuneiform
 +:Pros
 +:*Free
 +:*Mostly open source. The rest is to be released in the future.
 +:Cons
 +:*Currently Windows only. A port to Mac and Linux is planned, but work on it seems to have stalled.
 +:*Text output only, no searchable PDF
 +:*A lot of the documentation is in Russian
 +
 +* GOCR (JOCR)
 +:Pros
 +:*Free
 +:*Open source
 +:Cons
 +:*Text output only
 +:*Images are converted to PBM, PGM, or PPM. Is this a lossy conversion?
 +
 +* OCRAD
 +:Pros
 +:*Free
 +:*Open source
 +:Cons
 +:*Text output only
 +:*PBM, PGM, and PPM images only
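
GOCR and OCRAD both work with the Netpbm formats (PBM, PGM, PPM). The binary variants of these formats store raw, uncompressed samples after a short text header, so the format itself adds no compression loss. Information is lost only if the conversion reduces color or bit depth (for example, color to grayscale PGM, or grayscale to 1-bit PBM). A minimal Python sketch of that round trip, assuming 8-bit grayscale and a header without comment lines; `write_pgm` and `read_pgm` are hypothetical helper names, not part of either tool:

```python
import re


def write_pgm(width, height, pixels, maxval=255):
    """Encode 8-bit grayscale samples as a binary (P5) PGM byte string."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return header + bytes(pixels)


def read_pgm(data):
    """Decode a binary (P5) PGM. Assumes maxval < 256 and no '#' comments."""
    # The header is: magic, width, height, maxval, then exactly one
    # whitespace byte. Everything after that is raw sample data.
    match = re.match(rb"P5\s+(\d+)\s+(\d+)\s+(\d+)\s", data)
    if match is None:
        raise ValueError("not a binary PGM header")
    width, height, maxval = (int(g) for g in match.groups())
    start = match.end()
    pixels = list(data[start:start + width * height])
    if len(pixels) != width * height:
        raise ValueError("truncated pixel data")
    return width, height, maxval, pixels


# Round trip: the samples come back unchanged, including byte values
# (10 and 32) that happen to look like whitespace.
original = [10, 32, 0, 255, 128, 7]
encoded = write_pgm(3, 2, original)
assert read_pgm(encoded) == (3, 2, 255, original)
```

Whether a given converter preserves depth and color when producing these files is a separate question that would need to be checked per tool.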
 +
 +* Expervision (there is a trial version, which I have not been able to obtain yet)
 +:Pros
 +:*Exports to searchable PDF
 +:*Has an SDK for use with C/C++
 +:Cons
 +:*Licensing model involves royalty fees
 +:*Claims to be compatible with all operating systems, but the demo information says Visual C++ is required
 +
 +*<span style="color:blue;">Microsoft Office Digital Imaging (MODI) (I have not found a computer with this installed yet)</span>
 +:Pros
 +:*Included at no extra cost with some (possibly all) versions of MS Office on Windows. It is an optional install, so many computers do not have it.
 +:*I have read that text coordinates can be obtained from MODI.
 +:*Takes TIFF images
 +:Cons
 +:*Windows only
 +
 +* ReadSoft
 +:ReadSoft is geared more toward businesses that want to automate large-scale document processing in order to manage and organize data. OCR is only a small part of what they do. From talking with them, they do not seem to have a product that is specific to what we are doing.
 +
 +*<span style="color:blue;">SimpleOCR (I have not tried the demo yet)</span>
 +:Pros
 +:*Freeware version, including a command-line version and an SDK
 +:*Documentation says it can return coordinates of recognized words and images
 +:*Takes TIFF and other images
 +:Cons
 +:*Windows only
 +:*Does not appear to output PDF
 +
 +* PDF OCR X
 +:Pros
 +:* Free version and paid ($30) "Enterprise" version. The free version restricts PDFs to one page.
 +:* Takes TIFF files.
 +:Cons
 +:*Text output only
 +
 +* NovoDynamics
 +:Pros
 +:*Professional version creates searchable PDFs
 +:Cons
 +:*Focused mainly on Middle Eastern and Asian languages. Works on "Embedded English."
 +:*Expensive. Standard version costs $1300. Professional version is "call for pricing."
 +:*No demo available (at least not to me; perhaps one would be offered to a more promising potential customer)
 +
 +* MoreData/MoreDataFast (this is based on Tesseract)
 +:Pros
 +:*Free
 +:Cons
 +:*Windows only
 +:*Text output only
 +:*Documentation is in Italian
 +
 +* BrainWare
 +:BrainWare, like ReadSoft, is geared toward the bigger picture of automating data management. OCR is only a small piece of what they do.
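
The pros and cons above can be turned into a small lookup for shortlisting engines by requirement. A rough Python sketch follows. The tag names (`free`, `text_only`, and so on) and the `shortlist` helper are made up for illustration. The tags only restate what the list says, so a missing tag means "not stated", not "no". ReadSoft and BrainWare are left out because OCR is only a small part of those products.

```python
# Tags restate the notes in the list above; nothing here was measured.
# "free" includes limited free versions (e.g. PDF OCR X, one page per PDF).
# "coordinates" is based on second-hand or documentation claims only.
ENGINES = {
    "Cuneiform": {"free", "windows_only", "text_only"},
    "GOCR": {"free", "open_source", "text_only"},
    "OCRAD": {"free", "open_source", "text_only"},
    "Expervision": {"searchable_pdf", "sdk"},
    "MODI": {"windows_only", "coordinates"},
    "SimpleOCR": {"free", "sdk", "coordinates", "windows_only"},
    "PDF OCR X": {"free", "text_only"},
    "NovoDynamics": {"searchable_pdf"},
    "MoreData": {"free", "windows_only", "text_only"},
}


def shortlist(required=frozenset(), excluded=frozenset()):
    """Return engine names that have every required tag and no excluded tag."""
    required, excluded = set(required), set(excluded)
    return sorted(
        name
        for name, tags in ENGINES.items()
        if required <= tags and not (excluded & tags)
    )


# Engines noted as exporting searchable PDF.
assert shortlist({"searchable_pdf"}) == ["Expervision", "NovoDynamics"]

# Free engines that are not marked Windows only.
assert shortlist({"free"}, {"windows_only"}) == ["GOCR", "OCRAD", "PDF OCR X"]

# Engines reported to return text coordinates.
assert shortlist({"coordinates"}) == ["MODI", "SimpleOCR"]
```

Sets of tags are used instead of boolean columns so that untested properties stay absent and are not recorded as false.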
  