We have all seen websites that require us to type oddly shaped letters and numbers, displayed against a patterned background of some sort, into a box in order to gain access to a website or an account on the web.
A box like this on the web is called a CAPTCHA, an acronym for the phrase Completely Automated Public Turing test to tell Computers and Humans Apart. Websites use CAPTCHAs to check whether the request being made (such as opening an email account) comes from a real human or from a bot (a web robot, which is a software application frequently used maliciously by hackers, spammers and other web lowlifes). Humans can read and correctly reproduce CAPTCHAs, while bots and other software running on computers generally can't.
It is estimated that a CAPTCHA adds about 10 seconds to the time it takes to access a website but, given that it helps to reduce spam and other abuse, most of us consider this ten seconds well spent. However, it is further estimated that, added up worldwide, those ten-second delays amount to 150,000 hours or more per day devoted to solving CAPTCHAs. For the individual user the time spent is practically unnoticeable but, in aggregate, we are talking about a lot of hours.
So what, you say?
Well, suppose we could begin to channel these ten-second delays into some other useful endeavor in addition to thwarting the schemes of Internet crooks? 150,000 hours is the equivalent of 18,750 people working a standard 8-hour day. At the current U.S. federal minimum wage of $5.15 per hour, this would cost an employer $772,500 PER DAY in wages alone (18,750 workers x 8 hours x $5.15), not counting employment taxes, benefits and other costs associated with hiring employees.
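The arithmetic behind these figures can be checked with a short Python sketch. The inputs (10 seconds per CAPTCHA, 150,000 hours per day, an 8-hour workday, $5.15 per hour) are the estimates quoted in the text; the variable names are only illustrative, and money is kept in whole cents to avoid rounding surprises. Note that multiplying the headcount by the hourly wage alone gives $96,562.50, which is the cost of a single hour of work by everyone, not of a full day.

```python
# Estimates quoted in the article (not measured values).
SECONDS_PER_CAPTCHA = 10
TOTAL_HOURS_PER_DAY = 150_000
WORKDAY_HOURS = 8
MIN_WAGE_CENTS = 515  # $5.15 per hour, stored as integer cents

# How many CAPTCHAs per day the 150,000-hour estimate implies.
captchas_per_day = TOTAL_HOURS_PER_DAY * 3600 // SECONDS_PER_CAPTCHA

# Equivalent number of people working a standard 8-hour day.
workers = TOTAL_HOURS_PER_DAY // WORKDAY_HOURS

# Wages for the full day: every one of the 150,000 hours is paid.
daily_wage_cents = TOTAL_HOURS_PER_DAY * MIN_WAGE_CENTS

# Wages for just one hour of work by all of those workers.
one_hour_wage_cents = workers * MIN_WAGE_CENTS

print(f"CAPTCHAs solved per day: {captchas_per_day:,}")
print(f"Equivalent full-time workers: {workers:,}")
print(f"Wages for a full day: ${daily_wage_cents / 100:,.2f}")
print(f"Wages for a single hour: ${one_hour_wage_cents / 100:,.2f}")
```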
While the idea of harvesting and putting to use the 150,000 hours per day that millions of people spend in little ten-second bursts is attractive, the question is: what of practical use can a person produce in the random ten seconds spent typing the key to access some site on the web?
It just so happens that on May 24, 2007, Ben Maurer, an undergraduate computer science student at Carnegie Mellon University, and his team of students and faculty not only figured out a way to harness all of these little ten-second bursts of labor, but also launched a service to make productive use of them.
Anyone who has attempted to scan text into a computer and save it in a text format rather than a graphic format knows that not every word or phrase is converted correctly. A change in the font, marks on the page, a change in handwriting style, etc. will all cause some words and phrases to scan incorrectly. We have programs to detect these scan errors, but it takes a human to look at the original text and key it in correctly. For the ordinary user scanning a few pages of text, this is a nuisance. But for the various companies attempting to scan whole libraries of books, or for Google, which is attempting to scan all of the world's books, it is more than a nuisance: it is hundreds of thousands of hours of human labor, all of which has to be paid for.
Maurer and his team looked at this and concluded that these unscannable words and phrases, which require a human to interpret, are no different from a CAPTCHA, which also requires a human to read. Their idea: why not pair an unscannable word or phrase with a standard CAPTCHA and have the user type both to enter a protected site? The computer already knows the answer to the standard CAPTCHA and can verify whether the user typed it correctly. If the user did, we can assume that the person probably typed the unknown word or phrase correctly as well, so the transcription of the unscannable word gets done for free.
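The pairing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the service's actual code: the names Challenge, TranscriptionStore and check_response are hypothetical, as is the rule of ignoring letter case and surrounding spaces when comparing answers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Challenge:
    """One box shown to the user: a known word plus an unscannable one."""
    control_word: str      # answer the server already knows
    unknown_word_id: str   # hypothetical ID of the word the scanner could not read


@dataclass
class TranscriptionStore:
    """Collects human guesses for each unscannable word."""
    guesses: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, word_id: str, guess: str) -> None:
        self.guesses.setdefault(word_id, []).append(guess)


def normalize(text: str) -> str:
    # Assumption: comparison ignores case and surrounding whitespace.
    return text.strip().lower()


def check_response(challenge: Challenge, control_answer: str,
                   unknown_answer: str, store: TranscriptionStore) -> bool:
    """Admit the user only if the known word is typed correctly.

    When it is, the guess for the unknown word is trusted enough to be
    saved as a candidate transcription; otherwise it is discarded.
    """
    if normalize(control_answer) != normalize(challenge.control_word):
        return False
    store.record(challenge.unknown_word_id, normalize(unknown_answer))
    return True


# Usage: one successful attempt and one failed attempt at the same challenge.
store = TranscriptionStore()
challenge = Challenge(control_word="morning", unknown_word_id="scan-0001")
passed = check_response(challenge, "Morning", "overlooks", store)
failed = check_response(challenge, "mourning", "overbooks", store)
```

In this sketch only the guess from the attempt that passed the control word ends up in the store; the guess from the failed attempt is thrown away.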
Maurer and his team have launched a grant-funded service called reCAPTCHA, which provides CAPTCHAs, with the unknown words included, free to individuals and companies that need CAPTCHA capabilities for security purposes. People typing the CAPTCHAs needed to access the sites run by participating individuals and companies will also be transcribing the unscannable text, which will then be automatically routed back to the people scanning books. Of course, there is a chance that someone will make a mistake when typing the unscannable word or phrase, but the developers feel that if an individual correctly types the standard CAPTCHA portion of the contents of the box (and the computer can verify this), then that person's transcription of the unscannable word or phrase will also be correct. The individual attempting to access the page or account will be concerned with typing the whole thing correctly, having no way of knowing that part of the text and numbers on display is an unscannable word or phrase from a book.
For more information on reCAPTCHA, go to recaptcha.net on the web.















