Scan document -> OCR -> Searchable PDF
I have scanned a letter and want to create a PDF so that I can post it on a web site, but I also want the PDF to be 'searchable'.
I have used Microsoft Office Document Imaging; it has created a TIFF file and the OCR has been performed. But when I use doPDF, it creates the PDF as an image and not a searchable PDF. So if I search for the word "Dear", it does not find anything because the page is treated as an image.
Any solutions?
-
- Posts: 1565
- Joined: Thu May 23, 2013 7:19 am
I don't know if MODI allows you to save in .rtf (or .doc) format after performing OCR, but if it doesn't, you could copy/paste the OCR-ed text into Word and convert it with doPDF from there. This should make it searchable. When you convert a TIFF to PDF it is converted as an image, which is why you have to go through the RTF/Word route.
Follow us to stay updated:
- Newsletter (get a discount for subscribing): https://www.dopdf.com/newsletter.html
- Facebook: https://www.facebook.com/dopdf
- Twitter: https://twitter.com/dopdf
- Linkedin: https://www.linkedin.com/showcase/dopdf
Microsoft Office Document Imaging (2003 edition) can only save as a TIFF or .mdi file, so there is no option to save as an RTF file. It does have the option to send the text to Word. However, the problem with that approach is that I lose the rest of the document, e.g. the letterhead. All it does is copy the text. As an example, if you got a letter from the President, you would want to retain the 'original' document, e.g. preserve the letterhead etc.
-
- Posts: 1565
- Joined: Thu May 23, 2013 7:19 am
As an example, if you got a letter from the President, you would want to retain the 'original' document, e.g. preserve the letterhead etc.
I would say it depends which president you got that from.
Anyway, I can't think of another solution for having the scanned document converted. It would be tedious, but if you have the text saved in a Word document you could include the letterhead as an image in the Word document (cropping it from your scan image).
anon: you should look at OpenOffice.org for a free and simple solution. Whilst Sun's flexible suite isn't much to look at, it's every bit as powerful as Microsoft Office, and with its PDF editing features, even more so.
P.S. Of course, doPDF is what you'll need for all your non-office documents, e.g. printing from the internet, CAD programs, etcetera.
I think you have your answer in the following tutorial:
http://www.wac.ohio-state.edu/pdf/scan/pdffromscan.html
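For anyone finding this thread later: one way to keep the original scan (letterhead and all) and still get a searchable PDF is to let an OCR engine write the PDF itself, placing the recognized text in an invisible layer behind the page image. Below is a minimal sketch, assuming the open-source Tesseract OCR command-line tool (version 3.03 or later) is installed; Tesseract is not mentioned anywhere above and is not part of doPDF or MODI. The function names and the file name `letter.tif` are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <output_base> -l <lang> pdf

    The trailing "pdf" selects Tesseract's PDF output, which keeps the scanned
    image as the visible page and adds the OCR text as a hidden layer.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "pdf"]


def make_searchable_pdf(image_path, output_base, lang="eng"):
    """Run Tesseract on the scan and return the path of the PDF it writes.

    Raises RuntimeError if the tesseract executable is not on PATH
    (assumption: Tesseract has been installed separately).
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # Tesseract appends ".pdf" to the output base name itself.
    return Path(str(output_base) + ".pdf")
```

Calling `make_searchable_pdf("letter.tif", "letter")` should leave a `letter.pdf` next to the scan, in which a search for "Dear" can succeed if the OCR recognized the word. Recognition quality depends on the scan, so it is worth checking the result before posting it.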