Introduction, Nuance OmniPage and ABBYY FineReader
Optical character recognition (OCR) uses digital imaging devices and software to read text on hard-copy documents or in digital files that are rendered as images. The recognised text can then be used to create editable digital files.
The process can be used for a variety of purposes, such as scanning hard-copy forms or making PDFs editable, but it is perhaps most useful for businesses that use a lot of paper documentation or have a lot of historical documentation that needs digitising.
Without OCR, digitising hard-copy documents would be a manual process, with businesses employing individuals to input hard-copy data into a system. Not only is this time-consuming and expensive, it also introduces the potential for human error. OCR reduces the amount of human work required and therefore helps to minimise the cost of digitising documents. Over time, it has also become increasingly accurate and capable, so errors are minimised too.
A variety of OCR packages are available, pitched at different levels and serving different purposes. It's important, therefore, that businesses have a good idea of what they require from an OCR package before making any decision.
This article provides an overview of some of the most popular packages on the market. It gives a description of each and should provide a good starting point for businesses looking to purchase an OCR package.
Nuance OmniPage Ultimate
Price: £169.99 (around US$270, AU$310)
Nuance is a provider of voice and language solutions for businesses and consumers. The firm is based in Massachusetts, US, and employs around 12,000 people in over 35 offices across the world. Its Dragon voice recognition software is regarded as the industry leader, and the company also produces voice-based documentation software solutions for the healthcare industry. Nuance also produces the OmniPage suite.
OmniPage Ultimate is a document scanning and conversion package. It is aimed at business professionals, small businesses and workgroups that process, distribute and store paper or PDF documents.
The package provides a means of employing a number of different devices on a network to scan documents to a local computer or central server. It allows users to scan high volumes of documents and turn hard-copy forms, images and PDFs into editable digital files.
Amongst the benefits OmniPage Ultimate offers are a high level of character recognition accuracy, the ability to preserve a document's original formatting, the option to capture text with a digital camera or smartphone camera, recognition of over 120 different languages, and support for a wide range of formats and applications including HTML, Corel WordPerfect and Microsoft Office.
ABBYY FineReader Professional
Price: £99 (around US$160, AU$180)
ABBYY was founded in 1989 as BIT Software, and renamed in 1997. The company creates artificial intelligence technologies, products and services to extract information from sources in which it would otherwise be digitally inaccessible. Amongst its products and services are dictionary tools, translation and business card reading.
ABBYY FineReader Professional converts paper and image documents into editable digital formats, such as DOC and PDF files. The software uses what ABBYY calls Advanced Adaptive Document Recognition Technology to accurately translate a document's formatting and page structure. It is able to pick out text from digital photographs and it also supports the recognition of over 190 different languages, which ABBYY says is more than any other OCR package on the market.
FineReader has built-in text verification and editing tools that are aimed at reducing the amount of editing and number of corrections required after documents have been processed. It is also able to create mobile-friendly versions of documents for use with e-book readers, tablets and smartphones. FineReader has been updated to fit the Windows 8 look and feel, and allows users to easily save output files to cloud services such as Dropbox and Google Drive. It is available for both Windows and Mac.
IRIS Readiris, Creaceed Prizmo and CVision Maestro
IRIS Readiris Pro
Price: US$129 (around £80, AU$145)
IRIS seeks to help its customers better manage their documents, data and information. The company is owned by Canon and works with a number of technologies, including intelligent document recognition, document, content and process management, and optimised IT infrastructure. IRIS and its products have won a variety of awards and media recommendations.
Readiris Pro is more basic than some of the other OCR packages on the market, aimed simply at providing users with the ability to convert image, paper and PDF files into editable and searchable files. It is designed to work with normal scanners and will output a variety of digital files including DOC, XLS, PDF and HTML. It also provides a simple function that re-renders locked PDFs to be more searchable whilst looking exactly the same, and it boasts the ability to compress document sizes by up to 50 times without reducing their visual quality.
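To put the "up to 50 times" compression figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 10 GB archive size is a hypothetical example, the function name is invented here, and the ratio is the vendor's best-case claim, so real results are likely to vary with the documents involved.

```python
def compressed_size_mb(original_mb: float, ratio: float = 50.0) -> float:
    """Return the best-case size in MB after compressing by the given ratio.

    The default ratio of 50 is the upper bound quoted for Readiris Pro;
    it is a vendor claim, not a measured result.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return original_mb / ratio


# Hypothetical archive of scanned documents: 10 GB, expressed in MB.
archive_mb = 10_000
print(f"{compressed_size_mb(archive_mb):.0f} MB")  # prints "200 MB"
```

In other words, if the claimed maximum were achieved across the board, a 10 GB archive would shrink to roughly 200 MB.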
The Readiris Cloud Connector allows users to store and manage their documents in the cloud. Supported services are Evernote, Dropbox, Box and Google Drive. Documents can be automatically exported and then accessed from any device wherever the user is. In addition to these features, Readiris is capable of processing multi-page documents and supports the recognition of over 130 languages.
Creaceed Prizmo
Price: From US$49.95 (around £30, AU$57)
Creaceed is a Belgian company that was founded in 2008. It has a small team of four people and produces a variety of different apps for iOS and Mac. Its iOS apps include a video stabiliser and voice control tool. For Mac the company produces HDR imaging, image and video morphing and video toolbox apps. It also produces Prizmo.
Prizmo aims to provide a universal scanning experience for Mac owners. Users can use scanners, digital cameras or smartphones to take images of the document they need to digitise. If a picture is taken with a device connected to the user's computer, it will be automatically imported into Prizmo. Prizmo then allows users to extract text from hard-copy documents and extract information from business cards. Users can edit outputs to make sure the results are fully accurate.
As well as the multi-page processing, perspective correction, page-curvature correction and text-to-speech features provided by Prizmo, the software also offers additional functionality via its Pro-Pack add-on. The Pro-Pack offers batch processing, automated actions and custom export scripts.
Prizmo comes with support for 10 built-in languages, which are English, French, German, Dutch, Italian, Spanish, Portuguese, Swedish, Danish and Norwegian. It also offers support for a further 30 languages that are available to download for free.
CVision Maestro
CVision Technologies is a provider of document automation solutions. Its software covers file compression, recognition technology, PDF workflow applications and document automation tools. CVision's Maestro provides batch automated OCR with what the company claims are the most accurate results available.
CVision says Maestro offers a number of advanced OCR functions, including the ability to identify text within documents captured at low resolution, to process documents containing multi-directional text, and to handle documents containing low-contrast colour text.
According to CVision, Maestro can be integrated into existing document and imaging workflows. It can process up to 20 pages per second, supports inputs in 11 different formats and can generate outputs in 10 different formats. Maestro is used by a number of major organisations including Barclays, RBS and Xerox.
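For businesses weighing up a large digitisation backlog, the quoted rate of up to 20 pages per second can be turned into a rough time estimate. The sketch below is a back-of-the-envelope calculation only: the one-million-page backlog is a hypothetical example, the function names are invented for illustration, and it assumes the peak rate is sustained throughout, which may not hold in practice.

```python
import math


def batch_seconds(pages: int, pages_per_second: float = 20.0) -> float:
    """Estimate processing time in seconds at a constant page rate.

    The default of 20 pages per second is the maximum CVision quotes for
    Maestro; treating it as a sustained rate is an assumption.
    """
    if pages < 0 or pages_per_second <= 0:
        raise ValueError("pages must be >= 0 and rate must be positive")
    return pages / pages_per_second


def format_duration(seconds: float) -> str:
    """Format a duration as hours, minutes and seconds, rounding up."""
    total = math.ceil(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


# Hypothetical backlog of one million pages.
print(format_duration(batch_seconds(1_000_000)))  # prints "13h 53m 20s"
```

On those assumptions, a million-page archive would take a little under 14 hours of continuous processing, before allowing for scanning, verification or corrections.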