Data Labeling By Users
STYLE
Name: Your Name Here
Source: Name of the resource url (text)
Description: 1-2 sentences about the data labeling resource
Data labeling by users
Name: Gail Carpenter
Source: The ESP game http://espgame.org/cgi-bin/login
Description: In the ESP game, online players label images, and the game is set up so that a label wins points only if it is novel and is chosen by both players. Over 33 million image labels have been collected since October 5, 2003. A user can search for images via http://www.captcha.net/esp-search.html. For example, searching for "brain" yields results (http://www.captcha.net/cgi-bin/search). Note that these images are labeled directly by people, unlike (e.g.) Google, which derives labels from nearby text.
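The agreement rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the game's actual implementation: the function name `agreed_label`, the `existing_labels` argument (labels already collected for the image, used here to decide what counts as "novel"), and the case-insensitive matching are all hypothetical.

```python
def agreed_label(guesses_a, guesses_b, existing_labels):
    """Return the first label that both players entered and that is novel.

    guesses_a, guesses_b: labels typed by each player, in order (hypothetical input).
    existing_labels: labels already recorded for this image (assumed meaning of "novel").
    Returns the agreed label in lowercase, or None if the players never agree.
    """
    already_known = {label.lower() for label in existing_labels}
    entered_by_b = {guess.lower() for guess in guesses_b}

    for guess in guesses_a:
        candidate = guess.lower()
        # A label wins only if the other player also chose it and it is new.
        if candidate in entered_by_b and candidate not in already_known:
            return candidate
    return None


# Both players typed "puppy"; "dog" would not count because it is already recorded.
winner = agreed_label(["dog", "puppy"], ["animal", "Puppy"], ["dog"])
```

Requiring agreement between two independent players is what makes the collected labels trustworthy without a separate verification step.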
Name: Gail Carpenter
Source: Amazon Mechanical Turk http://www.mturk.com/mturk/welcome
Description: From CNS alum Gary Bradski:
Amazon has a service called Mechanical Turk, named for the famous fake chess-playing machine.
The idea is that they harness thousands of people for "AI"-type tasks. I wish I had something like this back in my grad school days -- you can get ground truth on massive amounts of data very cheaply. It is easy to get labels and decisions on this service with no programming; capturing mouse clicks, etc., involves writing a program. You might want to use this to create ART training sets, for example, or to label words from sounds or images, or to vet how a classifier or segmentation routine is doing. The key is that, for around $100, you can get 10,000 human decisions. This fits well within grant monies, etc. Amazon.com calls it "Artificial Artificial Intelligence". Using their API, a program can actually call thousands of people to answer questions in real time -- the people become the subroutine.
The main point is, for academic or business needs -- if you or suitable students don't know how to use this, you are missing a tool that could vault you ahead in data sets or data analysis. I recently used it to collect 5000 images of people from video shots. It took 30 minutes hours and cost only $50.00. The command line interface is the easiest way to upload large-scale data queries: you just fill out XML forms. The programming interface is fairly solid now and easy to use through Java.
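As a quick check on the figures quoted above ($100 for 10,000 decisions, $50 for 5000 images), the per-item cost can be worked out with a short Python sketch. The helper names `cost_per_item` and `budget_for` are hypothetical, and the calculation assumes a flat price per decision with no service fees, which the quoted email does not specify.

```python
def cost_per_item(total_dollars, n_items):
    """Average cost in dollars of one human decision or labeled item."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return total_dollars / n_items


def budget_for(n_items, dollars_per_item):
    """Total dollars needed for n_items at a flat per-item price (assumed, no fees)."""
    return n_items * dollars_per_item


# Figures taken from the quoted email.
per_decision = cost_per_item(100, 10_000)  # 10,000 decisions for about $100
per_image = cost_per_item(50, 5000)        # 5000 images for $50

print(f"per decision: ${per_decision:.2f}")
print(f"per image:    ${per_image:.2f}")
```

Both quoted examples come to about one cent per item, so under the same flat-rate assumption a budget can be estimated with `budget_for(n_items, 0.01)`.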
Name: Gail Carpenter
Source: LabelMe http://labelme.csail.mit.edu/
Description: Image labeling resource from MIT CSAIL.