HATE-ITA: Hate Speech Detection in Italian Social Media Text

Debora Nozza, Federico Bianchi, Giuseppe Attanasio

July 2022

Abstract

Online hate speech is a dangerous phenomenon that can (and should) be countered promptly and properly. While Natural Language Processing supplies appropriate algorithms for this purpose, most research efforts are directed toward the English language. This strongly limits classification power in non-English languages. In this paper, we test several learning frameworks for identifying hate speech in Italian text. We release HATE-ITA, a multilingual model trained on a large set of English data and the available Italian datasets. HATE-ITA performs better than monolingual models and also seems to adapt well to language-specific slurs. We hope our findings will encourage research in other mid-to-low-resource communities and provide a valuable benchmarking tool for the Italian community.

Type

Conference paper

Publication

Sixth Workshop on Online Abuse and Harms (WOAH) at NAACL 2022
