Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels
Crowdsourcing has become a primary means of label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by crowdsourcing workers is the Dawid-Skene estimator. In this paper, we prove convergence rates of a projected EM algorithm for the Dawid-Skene estimator. The exponent revealed in the rate of convergence is shown to be optimal via a lower bound argument. Our work resolves the long-standing issue of whether the Dawid-Skene estimator has sound theoretical guarantees in addition to the good performance observed in practice. In addition, a comparative study with majority voting illustrates both the advantages and the pitfalls of the Dawid-Skene estimator.
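To make the two estimators named in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract does not spell out the projected EM procedure, so the sketch assumes a simplified binary "one-coin" Dawid-Skene model in which worker i labels each item correctly with probability p_i, and it assumes that "projection" means clipping each p_i to an interval [lam, 1 - lam]. The function names, the value of lam, and the majority-vote initialisation are all illustrative choices.

```python
import math


def majority_vote(labels):
    """labels[i][j] in {0, 1} is worker i's label for item j. Ties go to 0."""
    n_workers, n_items = len(labels), len(labels[0])
    return [
        1 if 2 * sum(labels[i][j] for i in range(n_workers)) > n_workers else 0
        for j in range(n_items)
    ]


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def projected_em(labels, lam=0.05, n_iter=20):
    """One-coin Dawid-Skene EM with clipped worker abilities (assumed form).

    Returns (estimated_labels, worker_abilities).
    """
    n_workers, n_items = len(labels), len(labels[0])
    # Initialise q[j] = P(y_j = 1) with the fraction of workers voting 1.
    q = [
        sum(labels[i][j] for i in range(n_workers)) / n_workers
        for j in range(n_items)
    ]
    p = [0.5] * n_workers
    for _ in range(n_iter):
        # M-step: expected agreement rate per worker, then the projection
        # (assumption: clip to [lam, 1 - lam] so the log-odds stay finite).
        for i in range(n_workers):
            agree = sum(
                q[j] if labels[i][j] == 1 else 1.0 - q[j]
                for j in range(n_items)
            )
            p[i] = min(max(agree / n_items, lam), 1.0 - lam)
        # E-step: posterior of y_j = 1 under a uniform prior; each worker
        # votes with weight log(p_i / (1 - p_i)).
        for j in range(n_items):
            z = sum(
                (2 * labels[i][j] - 1) * math.log(p[i] / (1.0 - p[i]))
                for i in range(n_workers)
            )
            q[j] = _sigmoid(z)
    return [1 if qj > 0.5 else 0 for qj in q], p


# Toy data: two consistent workers and one who errs on items 3 and 6.
labels = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
estimated, abilities = projected_em(labels)
```

The contrast with majority voting is in the E-step: majority voting weights every worker equally, whereas the EM iteration weights each vote by the log-odds of that worker's estimated ability. Without the clipping step, a worker whose estimated ability reaches 0 or 1 would receive an infinite weight.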
May-30-2016
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Connecticut > New Haven County
- New Haven (0.04)
- Washington > King County
- Redmond (0.04)
- Genre:
- Research Report (0.50)
- Technology: