PANDA -- Paired Anti-hate Narratives Dataset from Asia: Using an LLM-as-a-Judge to Create the First Chinese Counterspeech Dataset

📅 2025-01-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Chinese hate speech research faces two critical bottlenecks: (1) a lack of open, high-quality, pairwise-labeled counterspeech datasets grounded in Mainland China’s sociocultural context; and (2) insufficient reliability of mainstream LLM-as-a-Judge evaluation methods under Chinese cultural norms. Method: We introduce CS-Chn—the first publicly available, culturally adaptive counterspeech dataset for Modern Standard Chinese in Mainland China—employing an innovative LLM-as-a-Judge framework integrated with simulated annealing and rotational sampling to generate candidate responses, followed by multi-stage human verification and cultural sensitivity review. Contribution/Results: Empirical analysis reveals severe scarcity of Chinese hate speech data and systematic biases in LLM-based scoring. We release CS-Chn: an open, high-quality, human-annotated, pairwise dataset comprising 12K samples. CS-Chn establishes a benchmark resource and methodological paradigm for Chinese counterspeech research.

📝 Abstract

Despite the global prevalence of the Modern Standard Chinese language, counterspeech (CS) resources for Chinese remain virtually nonexistent. To address this gap in East Asian counterspeech research, we introduce a corpus of Modern Standard Mandarin counterspeech that focuses on combating hate speech in Mainland China. This paper proposes a novel approach to generating CS that combines an LLM-as-a-Judge, simulated annealing, zero-shot counter-narrative (CN) generation by LLMs, and a round-robin algorithm, followed by manual verification for quality and contextual relevance. This paper details the methodology for creating effective counterspeech in Chinese and other non-Eurocentric languages, including unique cultural patterns in which groups are maligned and linguistic patterns in which kinds of discourse markers are programmatically marked as hate speech (HS). Through analysis of the generated corpora, we provide strong evidence for the lack of open-source, properly labeled Chinese hate speech data and for the limitations of using an LLM-as-a-Judge to score candidate answers in Chinese. Moreover, the present corpus serves as the first CS corpus for an East Asian language and provides an essential resource for future research on counterspeech generation and evaluation.
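
The abstract names four components (zero-shot generation, a round-robin algorithm, an LLM-as-a-Judge, and simulated annealing) but does not say how they are wired together. The sketch below shows one plausible arrangement, not the authors' implementation: candidate replies are drawn from several generators in round-robin order, a judge scores each one, and a simulated-annealing acceptance rule decides whether to move to the new candidate. Every name here (`anneal_counterspeech`, `make_stub_generator`, `stub_judge`) is hypothetical, and the generators and judge are toy stand-ins for real LLM calls.

```python
import math
import random
from itertools import cycle
from typing import Callable, List, Tuple

# Hypothetical interfaces: a generator proposes a reply to a hate-speech
# input, and a judge returns a score where higher is better.
Generator = Callable[[str, random.Random], str]
Judge = Callable[[str, str], float]


def anneal_counterspeech(
    hate_speech: str,
    generators: List[Generator],
    judge: Judge,
    steps: int = 30,
    t_start: float = 1.0,
    cooling: float = 0.9,
    seed: int = 0,
) -> Tuple[str, float]:
    """Return the best (reply, score) pair seen during the search."""
    rng = random.Random(seed)
    turn = cycle(generators)  # round-robin: each generator takes its turn
    current = next(turn)(hate_speech, rng)
    current_score = judge(hate_speech, current)
    best, best_score = current, current_score
    temperature = t_start
    for _ in range(steps):
        candidate = next(turn)(hate_speech, rng)
        score = judge(hate_speech, candidate)
        delta = score - current_score
        # Metropolis rule: always accept an improvement; accept a worse
        # candidate with probability exp(delta / temperature).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, score
        if current_score > best_score:
            best, best_score = current, current_score
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


def make_stub_generator(templates: List[str]) -> Generator:
    """Toy stand-in for a zero-shot LLM call: picks a canned reply."""
    def generate(hate_speech: str, rng: random.Random) -> str:
        return rng.choice(templates)
    return generate


def stub_judge(hate_speech: str, reply: str) -> float:
    """Toy stand-in for an LLM judge: longer replies score higher, capped at 1."""
    return min(len(reply) / 40.0, 1.0)


if __name__ == "__main__":
    generators = [
        make_stub_generator(["Short reply.", "A somewhat longer reply text."]),
        make_stub_generator(["A detailed reply that addresses the claim directly."]),
    ]
    print(anneal_counterspeech("placeholder input", generators, stub_judge))
```

In a real pipeline the stubs would be replaced by model calls, and the selected reply would still go through the manual verification step the abstract describes. The abstract also reports limitations of LLM-as-a-Judge scoring in Chinese, so the judge score here should be read as a noisy search signal and not as a quality guarantee.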

Problem

Research questions and friction points this paper is trying to address.

Modern Mandarin

Hate Speech

Data Resource

Innovation

Methods, ideas, or system contributions that make the work stand out.

PANDA Corpus

Simulated Annealing with Language Models

Zero-shot Generation and Round-robin Algorithm
