AI Summary
Implicit hate speech (IHS) detection is challenging due to the absence of overt slurs and its reliance on irony, implication, or coded language. This paper proposes a lightweight, efficient approach: fine-tuning only the embedding layers of general-purpose large language models (LLMs), such as Stella, Jasper, NV-Embed, and E5, at a fine-grained level, without incorporating external knowledge or auxiliary modules. The method significantly enhances semantic representation capability for IHS. Its core contribution lies in empirically validating the strong cross-dataset generalizability of pure embedding fine-tuning, a paradigm previously underexplored for IHS. Experiments demonstrate improvements of up to 1.10 percentage points in macro-F1 on in-domain evaluation and up to 20.35 percentage points in cross-dataset settings. This offers a scalable, easily deployable solution for IHS detection under low-resource conditions and without supervised pretraining assumptions.
Abstract
Implicit hate speech (IHS) is indirect language that conveys prejudice or hatred through subtle cues, sarcasm, or coded terminology. IHS is challenging to detect because it does not include explicitly derogatory or inflammatory words. To address this challenge, task-specific pipelines are often complemented with external knowledge or additional information such as context, emotions, and sentiment data. In this paper, we show that solely fine-tuning recent general-purpose embedding models based on large language models (LLMs), such as Stella, Jasper, NV-Embed, and E5, achieves state-of-the-art performance. Experiments on multiple IHS datasets show improvements of up to 1.10 percentage points in in-dataset evaluation and up to 20.35 percentage points in cross-dataset evaluation, in terms of macro-F1 score.
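The recipe described above, fine-tuning a pretrained embedding model together with a classification head on labeled IHS data, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the tiny `ToyEncoder` stands in for the actual pretrained models (Stella, Jasper, NV-Embed, E5), and the data is random toy data.

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Toy stand-in for a pretrained LLM-based embedding model.

    In the paper's setting this would be a model such as Stella or E5;
    here a single embedding table with mean pooling is used so the
    sketch is self-contained and runnable.
    """

    def __init__(self, vocab_size: int = 1000, dim: int = 32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Mean-pool token embeddings into one sentence embedding.
        return self.emb(token_ids).mean(dim=1)


encoder = ToyEncoder()
head = nn.Linear(32, 2)  # binary: implicit hate vs. not

# Fine-tune the embedding model and head jointly with cross-entropy;
# no external knowledge sources or auxiliary modules are involved.
opt = torch.optim.AdamW(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-3
)
loss_fn = nn.CrossEntropyLoss()

# Toy batch: 8 "sentences" of 16 token ids each, with random labels.
x = torch.randint(0, 1000, (8, 16))
y = torch.randint(0, 2, (8,))

for _ in range(5):
    opt.zero_grad()
    logits = head(encoder(x))
    loss = loss_fn(logits, y)
    loss.backward()
    opt.step()
```

In practice the encoder would be loaded from a pretrained checkpoint and trained on an IHS dataset rather than random tensors; the point of the sketch is only that the whole pipeline is the embedding model plus a linear head.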