Do you get frustrated filling out those online forms with jumbled letters to prove that you're human, only to get them wrong? They're called CAPTCHA puzzles and are designed to be difficult for computers to crack. Google's Street View technology, however, can decipher them with 99 percent accuracy.
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. The term was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. The test is designed to tell humans and bots apart on the web for reasons such as minimizing spam.
CAPTCHA services tell humans and bots apart in different ways. A common approach is to distort and skew a series of letters so that they are too difficult for automated programs to decipher, but not so difficult that humans can't make them out.
reCAPTCHA, a company that Google bought in 2009, takes a different approach that not only provides the security of a CAPTCHA form, but actually serves another clever and surprising purpose. reCAPTCHA is used to digitize old printed materials such as books. It presents the user with two words that could not initially be read by computers – one that has previously been verified by a number of users and one that has not. In this way, it can both determine if a user is human and partially verify a new word.
Not only is this process completely automated, but with about 200 million CAPTCHA puzzles being solved around the world every day, it can make a substantial contribution to bringing old, pre-computer texts onto the Web. reCAPTCHA's slogan perhaps sums it up best: "Digitizing books one word at a time."
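The two-word scheme described above can be illustrated with a minimal Python sketch. This is not reCAPTCHA's actual implementation: the function names, the case-insensitive comparison, and the three-vote agreement threshold are all assumptions made for illustration. It only shows the idea that a correct answer on the known word both passes the user and counts as a vote on the unknown word.

```python
from collections import Counter

# Hypothetical value: how many matching answers are needed before
# an unknown word is treated as verified. The article gives no figure.
AGREEMENT_THRESHOLD = 3


def normalize(text):
    """Compare answers case-insensitively, ignoring surrounding whitespace."""
    return text.strip().lower()


def submit_answer(control_word, control_guess, unknown_guess, votes):
    """Return True if the user passes the test.

    control_word is the word already verified by earlier users.
    votes is a Counter of readings collected so far for the unknown word.
    """
    if normalize(control_guess) != normalize(control_word):
        # Failed the known word: treat as a bot and discard the other guess.
        return False
    # Passed the known word: count the reading of the unknown word as a vote.
    votes[normalize(unknown_guess)] += 1
    return True


def verified_reading(votes):
    """Return the agreed reading once enough users concur, otherwise None."""
    if not votes:
        return None
    reading, count = votes.most_common(1)[0]
    return reading if count >= AGREEMENT_THRESHOLD else None
```

In this sketch, each unknown word gets its own `Counter`, and the word would only move into the pool of known words once `verified_reading` returns a value.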
Given the purpose and prevalence of CAPTCHAs, it's understandable that individuals with malicious intent may want to crack them. Google does its own research into this so that it can improve the security of the reCAPTCHA service. As part of its research, it has applied the technology it uses for identifying house numbers in Street View to identifying CAPTCHA words.
The research, detailed in the paper "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks," found that the technology could "decipher the hardest distorted text puzzles from reCAPTCHA with over 99 percent accuracy." As a result, Google suggests that answering a distorted image puzzle should not be the only factor used to distinguish a human from a machine.
Google says its research in this area helps it to improve the reCAPTCHA service. "Thanks to this research, we know that relying on distorted text alone isn’t enough," says Vinay Shet, Product Manager at reCAPTCHA, in a blog post. "However, it’s important to note that simply identifying the text in CAPTCHA puzzles correctly doesn't mean that reCAPTCHA itself is broken or ineffective. On the contrary, these findings have helped us build additional safeguards against bad actors in reCAPTCHA."
Google isn't the only company working in this area. Last year, San Francisco-based artificial intelligence startup Vicarious developed algorithms that have a 90 percent success rate in solving CAPTCHA puzzles.