Hard and Soft Evaluation of NLP models with BOOtSTrap SAmpling - BooStSa
Learning from Disagreement: A Survey
BERTective: Language Models and Contextual Information for Deception Detection
A Case for Soft Loss Functions
Fake opinion detection: how similar are crowdsourced datasets to real data?
Comparing Bayesian Models of Annotation