
OCR Text
OCR is an abbreviation of Optical Character Recognition.
It is an automated process in which specialist software converts written or printed text contained in TIF images into ASCII (computer) text.
The TIF images are obtained by scanning the original paper document.
The quality and accuracy of the conversion can vary, mainly depending on the quality of the scanned document. OCR accuracy is improved by a high DPI (dots per inch) scan resolution, coupled with quality control of the scanned image. OCR software is available “off the shelf” from online and high-street PC/software retailers. An OCR’d document has definite advantages over a basic scan image.
Once a document has been converted via OCR, the captured text becomes fully searchable. For example, a 1,000-page document can be read without OCR, but once it has been converted it is possible to search for every instance of a particular word, such as the name of a person or company.
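As a rough illustration of that kind of search, the Python sketch below assumes the OCR step has already produced one text string per page. The function name find_mentions and the sample pages are hypothetical and are not taken from any particular OCR product.

```python
import re


def find_mentions(pages, term):
    """Return (page_number, count) for every page whose text contains term."""
    # Case-insensitive, whole-word match. Assumes the term starts and ends
    # with a letter or digit, so the \b word boundaries behave as expected.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for page_number, text in enumerate(pages, start=1):
        count = len(pattern.findall(text))
        if count:
            hits.append((page_number, count))
    return hits


# Hypothetical OCR output: one string per scanned page.
pages = [
    "Invoice issued to Acme Ltd on 3 March.",
    "No company is named on this page.",
    "Acme Ltd confirmed receipt. Payment from ACME Ltd followed.",
]

print(find_mentions(pages, "Acme Ltd"))  # [(1, 1), (3, 2)]
```

Matching is deliberately case-insensitive because OCR output and the original document do not always agree on capitalisation. Misrecognised characters would still cause missed matches, which is one reason scan quality matters.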
Text can also be made editable, which is especially useful if the document is to be re-published. Another benefit is that, once converted to text via OCR, the page can be “re-constructed” free of any scan artifacts (dust, scratches etc.), resulting in a very clean image.
Another useful feature, text-to-speech, also becomes possible. Examples of OCR software include ABBYY FineReader, LEADTOOLS, ExperVision and OmniPage.
