Optical Character Recognition
A common way to create a PDF document is to scan the paper copy and then view the resulting file in application software such as Adobe Acrobat. This method is satisfactory if you will only ever want to view the document as an image, but optical character recognition (OCR) technology can make your documents far more accessible and versatile. OCR software works with your scanner to convert printed characters into digital text, making your document editable and searchable. This has several benefits over scanning documents as images:
Fast digital searching. OCR software converts scanned text into a word processing file that can be searched by keywords or specific phrases.
Editable. OCR documents can be edited, which makes it much easier to revise and update documents such as forms, resumes, and contracts.
Saving space. Converting paper documents to digital documents using OCR software means that original paper copies can be discarded, saving space in your office.
Accessibility. One of the most important benefits of using OCR software to convert paper documents to searchable and editable digital documents is accessibility. ADA compliance is a federal and state mandate assuring equal access to government-sponsored facilities for those with disabilities. A scanned document that is saved as an image cannot be read by the screen-reading software used by the visually impaired, meaning that the document is not ADA compliant. OCR software can also improve accessibility for the public or for certain authorized individuals, who can search your OCR document archives by keywords or phrases instead of sifting through endless PDFs or paper files to find the documents they are seeking.
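As a rough illustration of keyword searching, here is a minimal Python sketch. It assumes the OCR software has already produced plain text for each file; the `search_documents` function, the `archive` dictionary, and the file names are hypothetical and are not part of any particular OCR product.

```python
def search_documents(documents, phrase):
    """Return the names of documents whose OCR text contains the phrase.

    The match is case-insensitive. `documents` maps a file name to the
    text that OCR software recognized in that file (an assumed format).
    """
    needle = phrase.casefold()
    return sorted(
        name for name, text in documents.items()
        if needle in text.casefold()
    )


# Hypothetical OCR output: file name -> recognized text
archive = {
    "contract_2019.pdf": "This Lease Agreement is made between the parties.",
    "minutes_march.pdf": "The board approved the budget for the fiscal year.",
}

matches = search_documents(archive, "lease agreement")
print(matches)
```

A document saved only as an image would have no text to put in `archive`, which is why it cannot be searched this way.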
Make your OCR software work its best by following these tips:
Make sure you only scan the following types of documents:
- Printed text in sizes from 6 points (0.08 inches) to 72 points (1 inch)
- Typed text
- Laser and inkjet printed text
- Letter-quality dot matrix print
- Printed materials such as papers, books, magazines, brochures, etc.
- Faxes with a resolution greater than 200 dpi
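The numeric limits in this list can be expressed as simple checks. The sketch below is only an illustration: the function and constant names are made up for this example, and it relies on the standard conversion of 72 points to 1 inch.

```python
# Limits taken from the list above; the names are hypothetical.
MIN_POINT_SIZE = 6
MAX_POINT_SIZE = 72
MIN_FAX_DPI = 200
POINTS_PER_INCH = 72


def point_size_in_inches(points):
    """Convert a type size in points to inches (72 points = 1 inch)."""
    return points / POINTS_PER_INCH


def is_text_size_supported(points):
    """True if the type size falls in the 6 to 72 point range."""
    return MIN_POINT_SIZE <= points <= MAX_POINT_SIZE


def is_fax_resolution_supported(dpi):
    """True if the fax resolution is greater than 200 dpi."""
    return dpi > MIN_FAX_DPI


print(round(point_size_in_inches(6), 2))   # 6 points expressed in inches
print(is_text_size_supported(4))           # smaller than the supported range
print(is_fax_resolution_supported(300))
```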
For quality scans, avoid:
- Documents with extremely stylized text or handwriting
- Documents with blurry text
- Documents that were scanned upside down or sideways
- Documents with crossed-out lines or stains
- Documents with tight font spacing where the letters run together
Hosting options for OCR documents can include cloud OCR or self-hosting. Some OCR software solutions include features such as batch splitting (inserting blank pages to separate each document in a stack), automatic blank page removal, and image enhancements that can correct and straighten your images.
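To show how batch splitting and blank page removal fit together, here is a simplified Python sketch. It assumes each scanned page is represented by the text recognized on it, so a blank page is an empty or whitespace-only string. Real products typically detect blank pages from the image itself, and the `split_batch` function is a hypothetical name used only for this example.

```python
def split_batch(pages):
    """Split a scanned stack into documents at blank separator pages.

    `pages` is a list of strings, one per page, holding the recognized
    text (an assumed format). Blank pages mark document boundaries and
    are dropped from the output.
    """
    documents = []
    current = []
    for page_text in pages:
        if page_text.strip() == "":
            # Blank page: close the current document if it has pages.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page_text)
    if current:
        documents.append(current)
    return documents


stack = [
    "Invoice page 1", "Invoice page 2", "",
    "Contract page 1", "   ", "",
    "Memo page 1",
]
for number, document in enumerate(split_batch(stack), start=1):
    print(number, document)
```

Consecutive blank pages are treated as a single separator, so an extra blank sheet in the stack does not produce an empty document.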