The Amazing, Astonishing Google Check: How I Used Google to Spell-Check Every Word in My Book.

Posted on January 5, 2013 in Media/Reviews | 13 comments

Anyone who uses technical terms, scientific terms, foreign words, proper nouns, or brand names in their writing knows the limitations of the spell-checker built into a word processing program.  After all, it’s just a static list of words that got loaded onto your computer and never gets updated or expanded unless you do it yourself.  (And if you’ve ever tried one of the professional spell check add-ons, like Spellex, you may have noticed that they don’t always include every possible term in your field–I found the botanical checker particularly lacking–in which case, you still can’t be confident that you’ve caught everything.)

My latest book, The Drunken Botanist, is packed with weird, tricky words.  On a single page, I might mention the name of a flavor molecule, the Latin name of a plant, the surname of the French botanist who discovered it, and the brand name of a liqueur that is flavored by that plant.

That goes on for 400 pages.  You cannot imagine what a chore it was to proofread this book, and the level of sobriety required for the task.

After the completed, polished, edited, spell-checked manuscript had been proofread at least three times by me, my editor, a professional copy editor, a professional proofreader, and a few other people I probably don’t even know about, and had been read closely by a few smart friends and relatives, I got the pages back one last time for a final check. It had already been typeset by then, so I got it as one long PDF.

Every time I saw a tricky term that didn’t look right, I double-clicked the word, copied it, and pasted it into Google to check. Google, as you may know, is a surprisingly useful spell-checker: if you get a word wrong, you’ll probably get “Did you mean…” right under the search term. Even if that doesn’t happen, Google will generally take you to a variety of well-respected sources (or, in the case of a brand name, the company’s website) to help you check the spelling. It even catches pop culture terms, and it snags some context-specific stuff (for instance, if you wrote “hear” instead of “here”). And, bonus, Google is multilingual.

So as I was doing that, I was thinking, “I wish I could just Google the whole book.  Why can’t I do that?”

Then I realized that I could.  Google Docs (now called Google Drive) relies on Google’s search engine technology for its spell check function.

Why had I never thought of this before?  Here’s how I did it:

First, since I was working with a PDF, I copied the text and pasted it into a plain-text editor (Notepad, in my case).

Once I had the whole document in Notepad, I copied chunks of it into a blank Google Docs document. I found that there was an upper limit to how much text Google Docs could handle at once. What worked for me was to put my cursor at a starting point in the Notepad text, then hit Page Down about 15-20 times, and copy that much text at a time. In my case, that worked out to about 35,000 words at a time.

Once you paste it into Google Docs, it takes a little while to process and save it–roughly 20 seconds.  At some point beyond that 20-second mark, with a larger chunk of text, it just gives up and won’t process it at all–at least, that was my experience.  So the sweet spot seems to be right about 35,000 words.
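If counting Page Downs sounds tedious, the chunking step could probably be scripted. Here is a minimal sketch in Python, assuming the book’s text is already a plain-text string. The function name chunk_by_words is made up for illustration, and the 35,000-word default is just the limit that worked for me, not a documented Google Docs limit.

```python
def chunk_by_words(text, max_words=35_000):
    """Split text into chunks of roughly max_words words, breaking only between lines.

    A single line longer than max_words is kept whole as its own oversized chunk.
    """
    chunks, current, count = [], [], 0
    for line in text.splitlines(keepends=True):
        n = len(line.split())  # words on this line, split on whitespace
        # Start a new chunk when adding this line would exceed the limit.
        if current and count + n > max_words:
            chunks.append("".join(current))
            current, count = [], 0
        current.append(line)
        count += n
    if current:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    # Tiny stand-in for a manuscript; the real input would be the pasted PDF text.
    sample = "one two three\nfour five\nsix seven eight nine\n"
    for i, chunk in enumerate(chunk_by_words(sample, max_words=5), start=1):
        print(f"chunk {i}: {len(chunk.split())} words")
```

Each chunk can then be pasted into its own blank Google Docs document. Breaking only at line ends keeps words from being split in half, which would otherwise create false spelling errors at the chunk boundaries.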

Then all you have to do is go through and right-click on any word underlined in red.  It’ll give you a “Did you mean…” suggestion for anything that looks weird to Google–including people’s names, names of foreign cities, obscure scientific terms, all of it.

And guess what?  I found an astonishing thirty-eight errors with this method.

This is after it had been through a very rigorous and professional editing process that took months and passed through many very competent hands.  A process in which we’d all discussed how important it would be to check and double-check those tricky, difficult-to-check words. We weren’t even really proofreading anymore–this was just a final, quick look-see before it went to the printer.

And yet the silliest mistakes had escaped the notice of all of us.  Most of the mistakes I found had been in the original manuscript all along. We’d all missed them.

I can tell you that I will never again publish a book without running it through Google. (And I am fighting the temptation to Google my previous books. It is only the fact that I don’t have a PDF of the final version of each previous book that is holding me back.)

It’s time-consuming (the whole process took me 12-14 hours, in part because Google flagged a lot of words that were actually correct, and I still had to slow down and double-check them) but entirely worthwhile. I think that if I had it to do over again, I’d run the Google check twice during the editing process.

The first time would be right before I transmit the final version of the manuscript. This is the version that my editor and I have already been over at least three times and that I have spell-checked (both with the computer and with my eyeballs) many times.  Once we transmit it, I never get it back as a Word document again.  From that point on, someone else inputs the changes. And new errors can get introduced as those changes are made.

So I’d Google-check it once right before transmittal just to eliminate obvious errors and make the professional copy editor and proofreader’s jobs easier.  The fewer mistakes they have to contend with, the more likely they will be to catch all the stuff that computers don’t catch.

Then, when I got final, typeset pages, maybe at the second pass stage, I’d take the PDF and copy/paste it and do the Google check one more time.  It probably wouldn’t turn up much, but then again, I wasn’t expecting to find 38 errors this time.

The genius behind this technology appears to be a guy named Yew Jin Lim.  Dude, you are invited to Thanksgiving at my house every year, from now on. Do not be surprised if I dedicate my next book to you. Srsly.  (Google got that word right, btw. And that one.)


13 Comments

  1. Wow- amazing that I came across this on the VERY day we are editing our book for hopefully the last time!! Good to know- thank you!

  2. Thank you very much for this information. I am a long way from being ready to publish anything, but bitter experience makes me a slow editor!
    If you had not found this system so (relatively) painless I’d not have tried it, having been infuriated by the spell-check in my email (I tend to use a lot of “odd” and non-English words).
    When Mr. Yew comes to dinner please convey my thanks, too.

  3. Wow, this is great. Thank you for sharing!

  4. Happy New Year… Wonderful share here, thank you so much, Amy. I will surely share!

  5. Wish I had known about this before editing, proofreading, re-proofreading and spellchecking my book. And, yes, all of us, the editor included, missed several misspellings.
    Thanks for the great idea!

  6. Thanks so much Amy, this is really, really helpful info – it’s so easy to make mistakes and not spot them. I’ve just run my new course/book on Plant Design through it – it picked up on several very important things I’d missed!

  7. Great advice! Thank you very much.

  8. I can’t tell you how grateful I am for this tip! I’m halfway through the manuscript of my new book and will absolutely run it through Google Drive at least once. I HATE mistakes in books – drives me insane!!! As always – your tips are most appreciated. The next drink’s on me….

  9. Gulp. Well, this makes me feel slightly better, as I also have perfectionist standards and STILL, in spite of countless eyes and proofs, there were a few errors in my book. I think four so far. None of them were spelling errors, so this would not have helped, but I will remember for Next Time. Thanks, Amy. This post will undoubtedly go far and wide.

  10. Consider sending this hint in to Lifehacker.
    You have just saved so many people so much time – me included. Thank you!

  11. Thanks for the tip – I will use it on my botanical mystery before I submit. I can’t wait to read “The Drunken Botanist” – thank you for the amazing books you’ve written!

  12. Thanks for the heads up. I will definitely do just that when I post another article on my site.

  13. Google Docs (Drive) has many other interesting features, too. If you have a PDF that has text in it, you can actually have GDrive translate that back to a Document form completely (no copy-paste required); simply turn on both of the “Convert” options in the gear icon menu under “Upload options”, then upload the PDF. This works with many file types, including MS Word, RTF, ODF (ODT), and others.

    Since typeset formatting isn’t really reproducible in GDrive’s simpler word processor, you’ll lose much formatting. But for the purposes of bulk spell-checking, that’s probably not important.
