Google Docs Can Now Convert Images and PDFs to Text

Google’s web-based document editor can now convert the text inside your PDFs and images into text you can edit.

When you upload a file to Google Docs, you’ll see the option to “Convert text from PDF or image files to Google Docs documents.” You can upload any PDF, PNG, JPG or GIF.

To do the conversion, Google is relying on a technology commonly known as Optical Character Recognition, or OCR. The company began using OCR for web searches in 2008, then released experimental support for OCR-based conversion as part of its Documents List Data API in 2009.

Google has been improving the technology since then, and this is its first appearance in a Google product. Since it’s also part of the API, you can roll it into an app of your own creation, and we can expect the conversion tool to improve and yield some pretty cool applications down the road.

It’s not perfect, and the results will vary based on the resolution or visual clarity of whatever you’re uploading.

We converted Mark Klein’s public declaration from the AT&T/NSA wiretapping case. Here’s the original PDF from the Electronic Frontier Foundation, and here’s our Googlefied MS Word .doc file.

The cleaner the layout and the text rendering, the cleaner the result.

Below is a screenshot of Wired magazine’s iPad app, followed by the Google Docs conversion, Wired_iPad_app. You’ll notice it had some problems with the pullquote and the hyphens, but it navigated the two-column layout pretty well.

Images are a little iffy. Of course, the higher the resolution and the better lit your image, the better the results. You can upload just about any high-resolution image or long PDF, since Google Docs’ file size cap for these file types is a generous 1024MB. Note that 1024MB is also the storage limit for a free Google Docs account, so a single file at the cap would use up a free account’s entire quota.
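If you plan to feed files to the converter from your own script, it can help to check them against the limits described above before uploading. Here is a minimal sketch in Python. It assumes the four file types and the 1024MB cap mentioned in this article still apply, and it reads "1024MB" as binary megabytes. The function name `check_upload` and the constants are made up for illustration and are not part of any Google API.

```python
from pathlib import Path

# File types named in the article. ".jpeg" is an added assumption,
# on the theory that it is treated the same as ".jpg".
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".gif"}

# The 1024MB cap from the article, read here as binary megabytes (assumption).
MAX_UPLOAD_BYTES = 1024 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks eligible."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(
            f"file is {size_bytes} bytes, over the {MAX_UPLOAD_BYTES}-byte cap"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical file names and sizes, purely for illustration.
    for name, size in [("declaration.pdf", 2_500_000), ("notes.txt", 1_000)]:
        issues = check_upload(name, size)
        print(name, "->", "ok" if not issues else "; ".join(issues))
```

A check like this only tells you whether a file is eligible for upload. It says nothing about how well the text will convert, which still depends on resolution and clarity.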

The quality is about as good as our other favorite OCR-capable web application, Evernote. Based on our tests, however, Evernote seems to be better at lifting text out of images taken with a camera. Evernote can also read script typefaces, which Google’s OCR engine cannot. We gave Google Docs an image of the famous Jack Daniel’s Old No. 7 whiskey label, which uses a mix of fancy script and plain block text, and it was only able to convert the more traditionally styled bit at the bottom that lists the distillery’s address.
