There are always some papers that are not OCRed well. I tried running OCR text recognition on these papers once more with Acrobat, and, frustratingly, Acrobat gave me an error: 'This page contains renderable text'. I googled it and found
a troubleshooting guide, but it did not work for me. Although there is still no clear solution, thanks to
the post, which gave me a good explanation of what renderable text is.
Finally, I found a solution that is quite simple:
- Export all images of the problematic PDF in TIFF format;
- Combine these images into a new PDF;
- OCR the new PDF.
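
I did these steps by hand in Acrobat, but the same idea can be scripted. Below is a rough sketch in Python that swaps in free command-line tools, which is my assumption and not what I actually used: `pdfimages` (from Poppler) for the TIFF export, `img2pdf` for combining, and `ocrmypdf` for the OCR step. The function names and the `page` file prefix are made up for illustration. The trick presumably works because the new PDF contains only images, so there is no renderable text left for the OCR tool to complain about.

```python
import subprocess
from pathlib import Path


def extract_cmd(src_pdf, image_prefix):
    # Step 1: pdfimages with -tiff writes the embedded images as TIFF files
    # named <image_prefix>-000.tif, <image_prefix>-001.tif, ...
    return ["pdfimages", "-tiff", str(src_pdf), str(image_prefix)]


def combine_cmd(tiff_files, image_pdf):
    # Step 2: img2pdf wraps the images into a new, image-only PDF.
    return ["img2pdf", *[str(p) for p in tiff_files], "-o", str(image_pdf)]


def ocr_cmd(image_pdf, out_pdf):
    # Step 3: ocrmypdf adds a fresh text layer to the image-only PDF.
    return ["ocrmypdf", str(image_pdf), str(out_pdf)]


def reocr(src_pdf, out_pdf, workdir="reocr_work"):
    """Run the three steps in order; requires the tools above on PATH."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    prefix = workdir / "page"  # hypothetical prefix for the exported images

    subprocess.run(extract_cmd(src_pdf, prefix), check=True)

    # Zero-padded numbering means a plain sort keeps the page order.
    tiffs = sorted(workdir.glob("page-*.tif*"))
    if not tiffs:
        raise RuntimeError("no images were exported from " + str(src_pdf))

    image_pdf = workdir / "images_only.pdf"
    subprocess.run(combine_cmd(tiffs, image_pdf), check=True)
    subprocess.run(ocr_cmd(image_pdf, out_pdf), check=True)
    return Path(out_pdf)
```

Calling `reocr("troubled.pdf", "troubled_ocr.pdf")` would then run the whole chain. Note that this assumes a scanned paper with one image per page; a page built from several images would be split into several pages in the new PDF.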