Hate Speech Classifiers are Culturally Insensitive

Nayeon Lee (Korea Advanced Institute of Science and Technology), Chani Jung (Korea Advanced Institute of Science and Technology), Alice Oh (Korea Advanced Institute of Science and Technology)
January 1, 2023

Abstract

Language models and machine translation are increasingly valuable tools for helping people communicate across diverse cultural backgrounds. However, current language models lack cultural awareness because they are trained on data that reflects only the culture represented in their training datasets. This is a problem for hate speech classification, where cultural awareness is especially critical. This study quantifies the cultural insensitivity of three monolingual hate speech classifiers (Korean, English, and Arabic) by evaluating each one on translated datasets from the other two languages. We find that classifiers evaluated on datasets from other cultures yield significantly lower F1 scores, by up to almost 50%. They also produce considerably higher false negative rates, up to five times greater, which demonstrates the extent of the cultural gap. These results highlight the severity of the cultural insensitivity of language models in hate speech classification.
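For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how F1 and false negative rate are conventionally computed for a binary classifier. It is not the authors' evaluation code; the function name `f1_and_fnr` and the toy labels are hypothetical and carry no relation to the paper's data or results.

```python
from typing import Sequence, Tuple


def f1_and_fnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (F1, false negative rate) for binary labels, where 1 = hate speech.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hate speech caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign text flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hate speech missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # False negative rate: share of actual hate speech the classifier missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return f1, fnr


# Toy labels, invented purely to exercise the function.
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
f1, fnr = f1_and_fnr(toy_true, toy_pred)
print(f"F1={f1:.3f}, FNR={fnr:.2f}")
```

A false negative here is a hateful text that the classifier labels as benign, so a higher false negative rate on a translated dataset means more hate speech from another culture goes undetected.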

