List of optical character recognition software
An OCR SDK is a software development kit for adding optical character recognition capabilities to forms processing applications, document imaging management systems, e-discovery systems and records management solutions.

To reduce the difficulty of incorporating OCR technology, some OCR SDKs offer a large number of APIs and support multiple operating systems and programming languages.
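As a rough illustration of what integrating an OCR engine into an application can look like, the following Python sketch shells out to a command-line engine rather than using any vendor SDK. It is not taken from any product listed here. The names `ENGINE_COMMANDS`, `build_command` and `recognize` are hypothetical, and the command lines are assumptions based on the usual invocation of Tesseract (recent releases accept `stdout` as the output base) and Ocrad (prints recognized text to standard output); check them against the installed version.

```python
import shutil
import subprocess

# Hypothetical mapping from engine name to a command line that prints the
# recognized text to standard output. The arguments are assumptions based on
# each tool's usual command-line usage, not on any SDK described above.
ENGINE_COMMANDS = {
    "tesseract": lambda image: ["tesseract", image, "stdout"],
    "ocrad": lambda image: ["ocrad", image],
}


def build_command(engine, image_path):
    """Return the argument list for the given engine and image file."""
    try:
        return ENGINE_COMMANDS[engine](image_path)
    except KeyError:
        raise ValueError(f"unsupported OCR engine: {engine}") from None


def recognize(image_path, engines=("tesseract", "ocrad")):
    """Run the first listed engine found on PATH and return its text output."""
    for engine in engines:
        if shutil.which(engine) is None:
            continue  # engine not installed, try the next one
        result = subprocess.run(
            build_command(engine, image_path),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
    raise RuntimeError("no supported OCR engine found on PATH")
```

Calling `recognize("page.png")` would return the recognized text if one of the engines is installed and accepts the image format. A commercial SDK would typically expose a richer API (zones, languages, output formats) than this sketch.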

Here is a non-exhaustive comparison of optical character recognition software:
Name Founded year Latest stable version Release year License Online Windows Mac OS X Linux BSD Programming language SDK? Languages Fonts Notes
ABBYY FineReader  1989 11 2011 C/C++ 186 ABBYY also supplies SDKs for embedded or mobile devices. Professional, Corporate and Site License Editions for Windows, Express Edition for Mac.
AnyDoc Software 1989 VBScript Works with structured, semi-structured, and unstructured documents.
CuneiForm/OpenOCR 12 2007 C/C++ 28 Any printed font Enterprise-class system; can save text formatting and recognize complicated tables of any structure.
ExperVision TypeReader & RTK 1987 7.1.170.1125 2010 C/C++ 17 2618 Won the highest marks in the independent testing performed by UNLV for X consecutive years (in 1994).

"The speed of ExperVision’s OpenRTK is four to eight times faster than competition." (PC Magazine)
but also "Not as accurate as rival products, clumsy interface, limited options for proofreading, couldn't open some files in standard PDF or image formats." (PC Magazine)
GOCR 0.47 2009 C
LEADTOOLS  1990 17 2010 various 56 Any printed font Supports Latin, Asian, Arabic, and MICR character sets. For full page, zonal, and form image processing. Includes OCR, barcode, OMR and forms recognition. ICR (handwritten text recognition) is supported.
Java OCR  Java OCR 2010 Uses Java
Microsoft Office Document Imaging Office 2007 2007 Uses OmniPage
Microsoft Office OneNote 2007  2007 2007
Ocrad 0.20 2010 C++ Latin alphabet Command line
OCRopus 0.3.1 2008 C++ and Lua Pluggable framework which can use Tesseract
OCRFeeder 0.7.6 2009 Python Features a full user interface and has a command-line tool for automatic operations. Has its own segmentation algorithm but uses system-wide OCR engines like Tesseract or Ocrad
OmniPage 2005 18 2011 C/C++/C# Product of Nuance Communications
Puma.NET C# 28 Any printed font .NET OCR SDK based on Cognitive Technologies' CuneiForm recognition engine. Wraps Puma COM server and provides simplified API for .NET applications
Readiris 12 Pro 2009 C++ Product of I.R.I.S. Group of Belgium. Asian and Middle Eastern editions.
ReadSoft Scan, capture and classify business documents such as invoices, forms and purchase orders integrated with business processes.
RelayFax Many Converts faxed pages into editable document formats (doc, PDF, etc.).
Scantron Cognition For working with localized interfaces, corresponding language support is required.
SimpleOCR 2002 3.5 2008
SmartScore For musical scores
Tesseract 3.00 2010 C++, C 35+ Created by Hewlett-Packard; under further development by Google
Transym OCR 3.0 2008 C#, C/C++, VB, VB.NET 11
Zonal OCR
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 