Monday, January 30, 2012

Human Computation - ReCAPTCHA and Duolingo

CAPTCHA is an internet safeguard against automated form-filling programs. Created by Luis von Ahn, an associate professor at Carnegie Mellon University, CAPTCHAs consist of randomly generated character images that a user must decipher in order to submit online forms, a task that is easy for humans but hard for programs. With 200 million CAPTCHAs solved every day, Professor von Ahn realized that he could use this large amount of “human labor” to solve large-scale problems. As a result, ReCAPTCHA was invented to help digitize books by having one of the character images be a scanned word from an actual book. This showed von Ahn that the internet is a very powerful tool for coordinating people’s small contributions toward surprisingly useful goals, and it led him to the idea of using millions of people to translate the web for free. Instead of paying for language software, people can learn foreign languages and practice translating real web content on his free website, Duolingo. Combining multiple non-professional user translations actually results in an accurate translation of websites, which can speed up the process of translating and spreading information worldwide. This is a win-win strategy, and this network of internet users is key to the project’s success.
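To make the idea of combining several non-expert answers concrete, here is a minimal sketch in Python of one possible approach, simple majority voting. This is only an illustration and not the actual method used by ReCAPTCHA or Duolingo; the function name `consensus` and the thresholds `min_votes` and `min_share` are assumptions made up for this example.

```python
from collections import Counter


def consensus(answers, min_votes=3, min_share=0.6):
    """Return the answer most contributors agree on, or None if unclear.

    Hypothetical helper: min_votes and min_share are illustrative
    thresholds, not values taken from any real system.
    """
    # Normalize case and whitespace so trivial differences do not split votes.
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if len(normalized) < min_votes:
        return None  # not enough people have answered yet
    word, count = Counter(normalized).most_common(1)[0]
    # Accept the top answer only if a clear majority agrees on it.
    return word if count / len(normalized) >= min_share else None


print(consensus(["morning", "Morning ", "morning", "mourning"]))  # morning
print(consensus(["morning", "mourning"]))  # None
```

No single contributor has to be right every time; an answer is accepted only once enough independent contributors agree on it.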

This innovative project is a great example of network economics, a topic that is going to be covered in class. Wikipedia defines network economics as “business economics that benefit from network effects”. In other words, it’s a business model where the number of users affects how much value a product has for others. Duolingo uses this network effect and appeals to two types of users. One type is language learners who want to learn a foreign language for free. The other type is the worldwide readers and businesses that would like to have website content translated into multiple languages. As more Duolingo users participate, more of the internet can be translated accurately, economically, and efficiently. This helps build and strengthen the program and moves it toward its overall goal of increased international accessibility and connectivity over the web. This is a very interesting and useful business model, and the internet serves as a vital platform for it. The internet provides a fast way to access a variety of information, and is a powerful tool for combining user input with computer learning. This project is a great example of how the internet can help us create useful economic cycles and generate benefits for society.


  1. Crowd-sourcing = Awesome. I find it disappointing that we haven't brought crowd-sourcing to the highest intellectual levels. For example, for programming problems, there is StackOverflow; for physics questions, there is physicsforums; why is there no forum for P=NP? (Or maybe there is, but from my understanding, scientific collaborations typically involve fewer than ten people, whereas the number of people working on similar problems who could benefit from the insight of others is in the tens of thousands or more.)

    To speculate, I think Quora's method of introducing you to content that people you 'follow' have read or liked could lead (with some additional network exploitation) to global collaborative efforts. How many degrees away can all particle physicists possibly be?

    1. Actually, there are examples of "crowd sourcing for science" starting to pop up. Within CS theory, there's the "TCS stack-exchange" for talking about issues in proofs, etc.

      And within math, there have recently been some "polymath" projects organized by Fields medalist Terence Tao that attempt to bring hundreds of mathematicians together to solve one important problem.