CAPTCHA is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”.
We’ve all seen CAPTCHA codes on many websites: during registration, when posting comments or blog replies, when submitting feedback, and so on.
Basically, it’s a challenge-response test that lets a computer check whether it is communicating with a human being or with an automated program (a robot).
The need for CAPTCHAs began as far back as 1997. At that time, the internet search engine AltaVista was looking for a way to block automated URL submissions to the platform which were skewing the search engine’s ranking algorithms. To solve the problem, Andrei Broder, AltaVista’s chief scientist, developed an algorithm that randomly generated an image of printed text. Although computers couldn’t recognize the image, humans could read the message the image contained and respond appropriately. Broder and his team were issued a patent for the technology in April 2001.
In 2003, Nicholas Hopper, Manuel Blum, Luis von Ahn of Carnegie Mellon University and John Langford of IBM perfected the algorithm and coined the term CAPTCHA.
A text CAPTCHA is generated by distorting an image of letters and numbers in such a way that OCR technology fails while a human eye can still read and make sense of it. To date, no automated solver works reliably against every CAPTCHA, although simpler ones can be broken.
This is why CAPTCHAs are getting harder as time progresses: the latest pattern recognition and machine learning algorithms are capable of solving simpler CAPTCHAs, so each new generation of tests has to be hard enough to defeat those solvers.
Websites add CAPTCHA codes to their registration processes because of spam. Those crazy letters are a way to check that the person registering or trying to comment is a real live human being rather than a computer program attempting to spam the site. Yes, it’s the same reason most of us have some form of spam blocker on our email.
To prevent bots from overrunning sites with spam, fraudulent registrations, fake sweepstakes entries, and other nefarious things, publishers responded by testing users to see if they were human or not.
How do CAPTCHAs work?
CAPTCHAs are a kind of Turing test. Quite simply, end users are asked to perform some task that a software bot is not expected to manage. Tests often involve JPEG or GIF images, because while bots can identify the existence of an image by reading source code, they cannot easily tell what the image depicts. Because some CAPTCHA images are difficult to interpret, users are usually given the option to request a new test.
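The challenge-and-response flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular CAPTCHA product works: the names `issue_challenge`, `check_response`, and the in-memory `_pending` store are hypothetical, and drawing the distorted image from the text is left out.

```python
import secrets

# Hypothetical in-memory store mapping challenge tokens to expected answers.
# A real site would more likely keep this in a session or an expiring cache.
_pending = {}

# Assumed alphabet that skips look-alike characters such as O/0 and I/1.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def issue_challenge(length=6):
    """Create a random text challenge and return (token, text).

    The text is what would be drawn into a distorted image; the token is
    what the form sends back so the server can look up the expected answer.
    """
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_urlsafe(16)
    _pending[token] = text
    return token, text


def check_response(token, answer):
    """Return True if the answer matches. Each token can be used only once."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False
    submitted = answer.strip().upper()
    # Constant-time comparison on bytes, so non-ASCII input cannot raise.
    return secrets.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

Removing the token on the first check means a wrong guess forces a new test, which mirrors the "request a new test" option mentioned above.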
Types of CAPTCHAs
The most common type of CAPTCHA is the text CAPTCHA, which requires the user to view a distorted string of alphanumeric characters in an image and enter the characters in an attached form. Text CAPTCHAs are also rendered as MP3 audio recordings to meet the needs of the visually impaired. Just as with images, bots can detect the presence of an audio file, but the test assumes that only a human can listen to it and understand the information it contains.
Picture recognition CAPTCHAs, which are also commonly used, ask users to identify a subset of images within a larger set of images. For instance, the user may be given a set of images and asked to click on all the ones that have cars in them.
Other types of CAPTCHAs include:
- Math CAPTCHAs – require the user to solve a basic math problem, such as adding or subtracting two numbers.
- 3D Super CAPTCHAs – require the user to identify an image rendered in 3D.
- “I am not a robot” CAPTCHA – requires the user to check a box.
- Marketing CAPTCHAs – require the user to type a particular word or phrase related to the sponsor’s brand.
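As an illustration of the math CAPTCHA from the list above, here is a minimal Python sketch. The function names and the 1 to 9 operand range are assumptions chosen for the example, not part of any specific CAPTCHA library.

```python
import random


def make_math_captcha(rng=random):
    """Return (question, answer) for a small addition or subtraction problem."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    if rng.choice("+-") == "+":
        return f"What is {a} + {b}?", a + b
    # Put the larger number first so the answer is never negative.
    high, low = max(a, b), min(a, b)
    return f"What is {high} - {low}?", high - low


def check_math_answer(expected, submitted):
    """Compare the user's typed answer with the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Anything that is not a whole number counts as a failed attempt.
        return False
```

A question sent as plain text like this is easy for a script to parse, so in practice it would probably be rendered into an image just like a text CAPTCHA.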
Applications of CAPTCHAs
CAPTCHAs have several applications for practical security, including (but not limited to):
- Preventing Comment Spam in Blogs. Most bloggers are familiar with programs that submit bogus comments, usually for the purpose of raising search engine ranks of some website (e.g., “buy penny stocks here”). This is called comment spam. By using a CAPTCHA, only humans can enter comments on a blog. There is no need to make users sign up before they enter a comment, and no legitimate comments are ever lost!
- Protecting Website Registration. Several companies (Yahoo!, Microsoft, etc.) offer free email services. Up until a few years ago, most of these services suffered from a specific type of attack: “bots” that would sign up for thousands of email accounts every minute. The solution to this problem was to use CAPTCHAs to ensure that only humans obtain free accounts. In general, free services should be protected with a CAPTCHA in order to prevent abuse by automated scripts.
- Protecting Email Addresses from Scrapers. Spammers crawl the Web in search of email addresses posted in clear text. CAPTCHAs provide an effective mechanism to hide your email address from Web scrapers. The idea is to require users to solve a CAPTCHA before showing your email address. A free and secure implementation that uses CAPTCHAs to obfuscate an email address can be found at reCAPTCHA MailHide.
- Online Polls. In November 1999, http://www.slashdot.org released an online poll asking which was the best graduate school in computer science (a dangerous question to ask over the web!). As is the case with most online polls, IP addresses of voters were recorded in order to prevent single users from voting more than once. However, students at Carnegie Mellon found a way to stuff the ballots using programs that voted for CMU thousands of times. CMU’s score started growing rapidly. The next day, students at MIT wrote their own program and the poll became a contest between voting “bots.” MIT finished with 21,156 votes, Carnegie Mellon with 21,032 and every other school with fewer than 1,000. Can the result of any online poll be trusted? Not unless the poll ensures that only humans can vote.
- Preventing Dictionary Attacks. CAPTCHAs can also be used to prevent dictionary attacks in password systems. The idea is simple: prevent a computer from being able to iterate through the entire space of passwords by requiring it to solve a CAPTCHA after a certain number of unsuccessful logins. This is better than the classic approach of locking an account after a sequence of unsuccessful logins, since doing so allows an attacker to lock accounts at will.
- Search Engine Bots. It is sometimes desirable to keep webpages unindexed to prevent others from finding them easily. There is an HTML tag that asks search engine bots not to read a web page. The tag, however, doesn’t guarantee that bots won’t read the page; it only serves to say “no bots, please.” Search engine bots, since they usually belong to large companies, respect web pages that don’t want to allow them in. However, in order to truly guarantee that bots won’t enter a web site, CAPTCHAs are needed.
- Worms and Spam. CAPTCHAs also offer a plausible solution against email worms and spam: “I will only accept an email if I know there is a human behind the other computer.” A few companies are already marketing this idea.
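The dictionary-attack idea from the list above (ask for a CAPTCHA after a certain number of unsuccessful logins instead of locking the account) could look roughly like this in Python. The `LoginGate` class, its threshold of three attempts, and the `check_password` callback are all hypothetical names used only for this sketch.

```python
MAX_FREE_ATTEMPTS = 3  # assumed threshold; the article does not give a number


class LoginGate:
    """Track failed logins per user and demand a CAPTCHA once there are too many."""

    def __init__(self, check_password, max_free_attempts=MAX_FREE_ATTEMPTS):
        # check_password(username, password) -> bool is supplied by the caller.
        self._check_password = check_password
        self._max_free = max_free_attempts
        self._failures = {}

    def captcha_required(self, username):
        return self._failures.get(username, 0) >= self._max_free

    def attempt(self, username, password, captcha_passed=False):
        """Return 'ok', 'bad_password', or 'captcha_required'."""
        if self.captcha_required(username) and not captcha_passed:
            # The account is not locked; the caller just has to solve a CAPTCHA.
            return "captcha_required"
        if self._check_password(username, password):
            self._failures.pop(username, None)  # reset the counter on success
            return "ok"
        self._failures[username] = self._failures.get(username, 0) + 1
        return "bad_password"
```

Because the account is never locked, an attacker cannot shut a real user out; the legitimate owner only has to solve one CAPTCHA to log in again.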
And last but not least…
What is reCAPTCHA?
reCAPTCHA is a free service that protects your website from spam and abuse. reCAPTCHA uses an advanced risk analysis engine and adaptive CAPTCHAs to keep automated software from engaging in abusive activities on your site. It does this while letting your valid users pass through with ease.
reCAPTCHA offers more than just spam protection. Every time one of its CAPTCHAs is solved, that human effort helps digitize text, annotate images, and build machine learning datasets. This in turn helps preserve books, improve maps, and solve hard AI problems.
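For readers who want to see where reCAPTCHA fits on the server side, here is a hedged Python sketch. It assumes the `siteverify` endpoint and the `secret` / `response` form fields as described in Google's reCAPTCHA documentation; the helper names `_post_form` and `verify_recaptcha` are made up for this example, and the secret key is a placeholder you would get when registering a site.

```python
import json
import urllib.parse
import urllib.request

# Verification endpoint as documented by Google for reCAPTCHA.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def _post_form(url, fields):
    """POST form fields and return the decoded JSON reply as a dict."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def verify_recaptcha(secret_key, client_token, post=_post_form):
    """Return True if the verification service reports the token as valid.

    client_token is the value the reCAPTCHA widget adds to the submitted form.
    The post argument is only there so the network call can be swapped in tests.
    """
    if not client_token:
        return False  # the user never completed the widget
    result = post(VERIFY_URL, {"secret": secret_key, "response": client_token})
    return bool(result.get("success"))
```

A form handler would call `verify_recaptcha(...)` before accepting a comment or registration and reject the request when it returns False.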