Hate Speech

MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection

We present the system proposed by the MilaNLP team for the Explainable Detection of Online Sexism (EDOS) shared task. We propose an ensemble modeling approach to combine different classifiers trained with domain adaptation objectives and standard …

Respectful or Toxic? Using Zero-Shot Learning with Language Models to Detect Hate Speech

Hate speech detection faces two significant challenges: 1) the limited availability of labeled data and 2) the high variability of hate speech across different contexts and languages. Prompting brings a ray of hope to these challenges. It allows …

The State of Profanity Obfuscation in Natural Language Processing Scientific Publications

Work on hate speech has made considering rude and harmful examples in scientific publications inevitable. This situation raises various problems, such as whether or not to obscure profanities. While science must accurately disclose what it does, the …

A Cross-Lingual Study of Homotransphobia on Twitter

We present a cross-lingual study of homotransphobia on Twitter, examining the prevalence and forms of homotransphobic content in tweets related to LGBT issues in seven languages. Our findings reveal that homotransphobia is a global problem that takes …

It's Not Just Hate: A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though, annotation decisions are governed by optimizing time or annotator agreement. We make a case for nuanced efforts in an interdisciplinary setting for …

Measuring Harmful Representations in Scandinavian Language Models

Scandinavian countries are perceived as role-models when it comes to gender equality. With the advent of pre-trained language models and their widespread usage, we investigate to what extent gender-based harmful and toxic content exist in selected …

Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages

Hate speech is a global phenomenon, but most hate speech datasets so far focus on English-language content. This hinders the development of more effective hate speech detection models in hundreds of languages spoken by billions across the world. More …

HATE-ITA: Hate Speech Detection in Italian Social Media Text

Online hate speech is a dangerous phenomenon that can (and should) be promptly counteracted properly. While Natural Language Processing supplies appropriate algorithms for trying to reach this objective, all research efforts are directed toward the …

Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models

Hate speech detection models are typically evaluated on held-out test sets. However, this risks painting an incomplete and potentially misleading picture of model performance because of increasingly well-documented systematic gaps and biases in hate …

Benchmarking Post-Hoc Interpretability Approaches for Transformer-based Misogyny Detection

Transformer-based Natural Language Processing models have become the standard for hate speech detection. However, the unconscious use of these techniques for such a critical task comes with negative consequences. Various works have demonstrated that …