🤖 AI Summary
Arabic hate speech detection is challenging because of the language's extensive dialectal diversity. To address this, we introduce a multi-label Arabic hate speech dataset of 10,000 tweets collected from Twitter. Each tweet is annotated for offensiveness, and offensive tweets are further labeled with one or more of six target categories: religion, gender, politics, ethnicity, origin, and other. The dataset therefore supports both single- and multi-target classification. Annotation was carried out by multiple annotators under a multi-target annotation framework that reached high agreement (Krippendorff’s α = 0.86 for offensiveness, 0.71 for targets). We fine-tune AraBERTv2 and other Transformer-based models for multi-label classification; AraBERTv2 performs best, with a micro-F1 score of 0.7865 and an accuracy of 0.786. These results support the quality of the dataset and its annotations, and indicate that current Arabic language models are suitable for multi-target hate speech detection.
📝 Abstract
Identifying hate speech in Arabic is challenging because of the language's rich dialectal variation. This study introduces a multilabel hate speech dataset for Arabic. We collected 10,000 Arabic tweets and annotated each one as containing offensive content or not. Tweets that contain offensive content are further classified by hate speech target: religion, gender, politics, ethnicity, origin, or other. A tweet can have a single target or multiple targets. Multiple annotators took part in the annotation task, and the inter-annotator agreement was 0.86 for offensive content and 0.71 for the hate speech targets. Finally, we evaluated the annotated data with several transformer-based models, among which AraBERTv2 performed best, with a micro-F1 score of 0.7865 and an accuracy of 0.786.
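To illustrate the multi-target labeling and the micro-F1 metric mentioned above, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the `TARGETS` ordering, the helper names `encode_targets` and `micro_f1`, and the toy gold and predicted labels are all assumptions made for illustration. Each tweet's targets are encoded as a multi-hot vector, and micro-F1 is computed by pooling true positives, false positives, and false negatives over all tweets and all target labels.

```python
from typing import List, Sequence

# The six target categories named in the abstract; the ordering is an assumption.
TARGETS = ["religion", "gender", "politics", "ethnicity", "origin", "other"]


def encode_targets(targets: Sequence[str]) -> List[int]:
    """Return a multi-hot vector over TARGETS.

    An all-zero vector means the tweet has no target (hypothetical
    convention here, e.g. for a non-offensive tweet).
    """
    unknown = set(targets) - set(TARGETS)
    if unknown:
        raise ValueError(f"unknown target label(s): {sorted(unknown)}")
    return [1 if name in targets else 0 for name in TARGETS]


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (tweet, label) pair."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # With no positive labels and no positive predictions, F1 is defined as 0 here.
    return 2 * tp / denom if denom else 0.0


# Toy data (invented for illustration, not taken from the dataset).
gold = [
    encode_targets(["religion", "politics"]),  # multi-target tweet
    encode_targets(["gender"]),                # single-target tweet
    encode_targets([]),                        # no target
]
pred = [
    encode_targets(["religion"]),              # misses "politics" (1 FN)
    encode_targets(["gender", "origin"]),      # extra "origin" (1 FP)
    encode_targets([]),
]

score = micro_f1(gold, pred)  # TP=2, FP=1, FN=1, so F1 = 4 / 6
print(f"micro-F1 = {score:.4f}")
```

Because the counts are pooled before the score is computed, frequent targets weigh more heavily in micro-F1 than rare ones; a macro average would instead compute F1 per target and average the six values.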