Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing
Oleson, David (CrowdFlower) | Sorokin, Alexander (CrowdFlower) | Laughlin, Greg (CrowdFlower) | Hester, Vaughn (CrowdFlower) | Le, John (CrowdFlower) | Biewald, Lukas (CrowdFlower)
Crowdsourcing is an effective tool for scalable data annotation in both research and enterprise contexts. Due to crowdsourcing’s open participation model, quality assurance is critical to the success of any project. Present methods rely on EM-style post-processing or manual annotation of large gold standard sets. In this paper we present an automated quality assurance process that is inexpensive and scalable. Our novel process relies on programmatic gold creation to provide targeted training feedback to workers and to prevent common scamming scenarios. We find that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
Aug-8-2011
- Country:
  - North America > United States
    - California > San Francisco County
      - San Francisco (0.15)
    - Colorado > La Plata County
      - Durango (0.04)
    - New York > New York County
      - New York City (0.05)
    - North Carolina > Pitt County
      - Greenville (0.04)
    - Oregon > Multnomah County
      - Portland (0.04)
    - Washington > King County
      - Seattle (0.04)
- Industry:
  - Materials > Metals & Mining > Gold (0.79)
- Technology: