• Cuneiform

:Pros
:*Free
:*Mostly open source; the remainder is to be released in the future.
:Cons
:*Currently Windows only. A port to Mac and Linux is planned, but work on it seems to have stalled.
:*Text output only; no searchable PDF
:*Much of the documentation is in Russian

• GOCR (JOCR)

:Pros
:*Free
:*Open source
:Cons
:*Text output only
:*Images are converted to PBM, PGM, or PPM. Is this a lossy conversion?
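:On the lossy-conversion question: PGM and PPM store raw, uncompressed samples, so an 8-bit grayscale or RGB image should survive the conversion unchanged, whereas PBM holds only one bit per pixel, so reducing a grayscale image to PBM discards information. The sketch below illustrates this with a hand-written binary PGM round trip. It is not GOCR's own code; the function names (`write_pgm`, `read_pgm`, `to_bilevel`) are hypothetical, and the reader assumes a header without comment lines and a maximum value below 256.

```python
def write_pgm(width, height, pixels, maxval=255):
    """Encode 8-bit grayscale samples as a binary (P5) PGM byte string."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match dimensions")
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return header + bytes(pixels)


def read_pgm(data):
    """Decode a binary (P5) PGM; assumes no comment lines and maxval < 256."""
    fields, pos = [], 0
    # The header is four whitespace-separated tokens: magic, width, height, maxval.
    while len(fields) < 4:
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        fields.append(data[start:pos])
    if fields[0] != b"P5":
        raise ValueError("not a binary PGM")
    width, height, maxval = (int(f) for f in fields[1:])
    # Exactly one whitespace byte separates maxval from the raster data.
    raster = data[pos + 1:pos + 1 + width * height]
    return width, height, list(raster)


def to_bilevel(pixels, threshold=128):
    """Reduce grayscale samples to 1 bit each, as a PBM would store them."""
    return [1 if p >= threshold else 0 for p in pixels]


# Sample values chosen to include bytes that look like whitespace (10, 32).
original = [10, 32, 0, 255, 127, 128]
width, height, restored = read_pgm(write_pgm(3, 2, original))
print(restored == original)            # PGM round trip keeps every sample
print(len(set(to_bilevel(original))))  # bilevel keeps only two distinct values
```

:If the scans are already bilevel (black-and-white) TIFFs, the PBM case loses nothing either; the loss only arises when grayscale or color input is thresholded down to one bit.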

:Pros
:*Free
:*Open source
:Cons
:*Text output only
:*PBM, PGM, and PPM images only

• ExperVision (there is a trial version, which I have not been able to get yet)

:Pros
:*Exports to searchable PDF
:*Has an SDK for use with C/C++
:Cons
:*Licensing model involves royalty fees
:*Says it is compatible with all operating systems, but the demo information says Visual C++ is required?

• <span style="color:blue;">Microsoft Office Document Imaging (I have not found a computer with this installed yet)</span>

:Pros
:*Comes free with some or all versions of MS Office on Windows. It is an optional install, so many computers do not have it.
:*I have read that text coordinates can be obtained from MODI.
:*Takes TIFF images
:Cons
:*Windows only

• ReadSoft

:ReadSoft is geared more toward businesses looking to automate large-scale document processing for managing and organizing data. OCR is only a small part of what they do. From talking with them, they do not seem to have a product specific to what we are doing.

• <span style="color:blue;">SimpleOCR (I have not tried the demo yet)</span>

:Pros
:*Freeware version, including a command-line version and an SDK
:*Documentation says it can return coordinates of recognized words and images
:*Takes TIFF and other images
:Cons
:*Windows only
:*Does not appear to output PDF

• PDF OCR X

:Pros
:*Free version and paid ($30) “Enterprise” version. The free version restricts PDFs to one page.
:*Takes TIFF files
:Cons
:*Text output only

• NovoDynamics

:Pros
:*Professional version creates searchable PDFs
:Cons
:*Focused mainly on Middle Eastern and Asian languages. Works on “Embedded English.”
:*Expensive. The Standard version costs $1300; the Professional version is “call for pricing.”
:*No demo available (at least not to me; perhaps if I were a better potential customer?)

• MoreData/MoreDataFast (based on Tesseract)

:Pros
:*Free
:Cons
:*Windows only
:*Text output only
:*Documentation is in Italian

• BrainWare

:Brainware is a product like ReadSoft, geared toward the bigger picture of automating data management. OCR is only a small piece of what they do.