In Search of Ambiguity: A Three-Stage Workflow Design to Clarify Annotation Guidelines for Crowd Workers
Pradhan, Vivek Krishna, Schaekermann, Mike, Lease, Matthew
–arXiv.org Artificial Intelligence
While crowdsourcing now enables labeled data to be obtained more quickly, cheaply, and easily than ever before (Snow et al., 2008; Alonso, 2015; Sorokin and Forsyth, 2008), ensuring data quality remains something of an art, a challenge, and a perpetual risk. Consider a typical workflow for annotating data on Amazon Mechanical Turk (MTurk): a requester designs an annotation task, asks multiple workers to complete it, and then post-processes labels to induce final consensus labels. Because the annotation work itself is largely opaque, with only submitted labels being observable, the requester typically has little insight into what, if any, problems workers encounter during annotation. While statistical aggregation (Sheshadri and Lease, 2013; Hung et al., 2013; Zheng et al., 2017) and multi-pass iterative refinement (Little et al., 2010a; Goto et al., 2016) methods can be employed to further improve initial labels, there are limits to what can be achieved by post-hoc refinement following label collection. If initial labels are poor because many workers were confused by incomplete, unclear, or ambiguous task instructions, there is a significant risk of "garbage in equals garbage out" (Vidgen and Derczynski, 2020). In contrast, consider a more traditional annotation workflow involving trusted annotators, such as that practiced by the Linguistic Data Consortium (LDC) (Griffitt and Strassel, 2016).
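The post-processing step described above, inducing a final consensus label from multiple workers' labels, is in its simplest form just per-item majority voting (more sophisticated statistical aggregation methods are surveyed in the works cited). A minimal sketch, with hypothetical item and label names:

```python
from collections import Counter

def aggregate_majority(worker_labels):
    """Collapse per-item worker labels into consensus labels by majority vote.

    worker_labels: dict mapping item id -> list of labels from different workers.
    Returns a dict mapping item id -> most common label (ties broken arbitrarily).
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in worker_labels.items()}

# Example: three workers each label two items.
labels = {
    "doc1": ["spam", "spam", "ham"],
    "doc2": ["ham", "ham", "spam"],
}
print(aggregate_majority(labels))  # {'doc1': 'spam', 'doc2': 'ham'}
```

Note that majority voting illustrates the "garbage in, garbage out" risk directly: if ambiguous instructions lead most workers to the same wrong label, the vote confidently ratifies the error, which is why the paper targets guideline clarity rather than post-hoc aggregation.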
Dec-4-2021