Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer evidence that humans disagree, from objective tasks such as part-of-speech tagging to more subjective tasks such as classifying an image or deciding whether a proposition …
Recently, Peterson et al. provided evidence of the benefits of using probabilistic soft labels generated from crowd annotations for training a computer vision model, showing that using such labels maximizes performance of the models over unseen data. …
The analysis of crowdsourced annotations in natural language processing is concerned with identifying (1) gold standard labels, (2) annotator accuracies and biases, and (3) item difficulties and error patterns. Traditionally, majority voting was used …