2003

CAPTCHA

CAPTCHAs are tests administered by a computer to distinguish a human from a bot, that is, a piece of software pretending to be a person. They were created to prevent programs (more correctly, people using programs) from abusing online services intended for people. For example, companies that provide free email services to consumers sometimes use a CAPTCHA to prevent scammers from registering thousands of email addresses within a few minutes. CAPTCHAs have also been used to limit spam and to restrict editing of internet social media pages.
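The sign-up scenario above can be sketched in a few lines of Python. This is only a minimal illustration of the challenge-and-response idea, not any real service's implementation: the names `SignupGate`, `new_challenge`, and `check_answer` are hypothetical, and the step that makes a CAPTCHA hard for software (rendering the text as a distorted image) is left out.

```python
import hmac
import secrets

# Hypothetical alphabet: easily confused characters (0/O, 1/I/L) are left out.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


def new_challenge(length=6):
    """Return random text that a server would render as a distorted image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_answer(expected, typed):
    """Compare the typed answer, ignoring case and surrounding whitespace."""
    normalized = typed.strip().upper()
    return hmac.compare_digest(expected.encode(), normalized.encode())


class SignupGate:
    """Issues single-use challenges and checks the answers (illustrative only)."""

    def __init__(self):
        self._pending = {}  # challenge id -> expected text

    def issue(self):
        challenge_id = secrets.token_hex(8)
        text = new_challenge()
        self._pending[challenge_id] = text
        return challenge_id, text

    def redeem(self, challenge_id, typed):
        # pop() removes the challenge, so each one can be answered only once.
        expected = self._pending.pop(challenge_id, None)
        return expected is not None and check_answer(expected, typed)
```

Making each challenge single-use matters in this sketch: otherwise one solved puzzle could be replayed to register many accounts.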

CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. The term was coined in 2003 by computer scientists at Carnegie Mellon; however, the technique itself dates to patents filed in 1997 and 1998 by two separate teams, one at Sanctum®, an application security company later acquired by IBM, and one at AltaVista. These patents describe the technique in detail.

One clever application of CAPTCHAs is to improve and speed up the digitization of old books and other paper-based text material. The reCAPTCHA program takes scanned words that are illegible to OCR (Optical Character Recognition) technology and uses them as the puzzles to be retyped. Acquired by Google in 2009, reCAPTCHA helps improve the accuracy of Google’s book-digitizing project by having humans supply the “correct” reading of words too fuzzy for current OCR technology. Google can then use the images and human-provided recognition as training data for further improving its automated systems.
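One plausible way to turn many people's typed answers into a single "correct" reading is a simple vote. The sketch below is an assumption made for illustration, not a description of Google's actual pipeline: the function name `consensus_transcription` and the thresholds `min_votes` and `min_share` are hypothetical.

```python
from collections import Counter


def consensus_transcription(answers, min_votes=3, min_share=0.6):
    """Return the reading most people agree on, or None if there is no consensus.

    min_votes and min_share are illustrative thresholds, not published values.
    """
    # Normalize case and whitespace, and drop empty answers.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # too few answers to trust any of them
    word, count = Counter(cleaned).most_common(1)[0]
    # Accept the top answer only if a clear majority typed it.
    return word if count / len(cleaned) >= min_share else None
```

For instance, if four people read a smudged word and three of them type the same thing, the shared answer is accepted; if every person types something different, the word stays unresolved and can be shown to more people.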

As AI has improved, so has the ability of machines to solve CAPTCHA puzzles, creating a sort of arms race between puzzle makers and puzzle breakers. Different approaches have evolved over the years to create puzzles that are hard for computers but easy for people. For example, one of Google’s CAPTCHAs simply asks users to click a box that says “I am not a robot”; meanwhile, Google’s servers analyze the user’s mouse movements, examine the cookies, and even review the user’s browsing history to make sure the user is legitimate. Techniques to break or get around CAPTCHA puzzles also drive the improvement and evolution of CAPTCHA. One manual example is the use of “digital sweatshop workers” who type CAPTCHA solutions on behalf of spammers, reducing the effectiveness of CAPTCHAs at limiting the abuse of computer resources.

SEE ALSO The Turing Test (1951), First Internet Spam Message (1978)

CAPTCHAs require human users to enter a series of characters or take specific actions to prove they are not robots.