Gradient Masters at BLP-2025 Task 1: Advancing Low-Resource NLP for Bengali using Ensemble-Based Adversarial Training for Hate Speech Detection

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses hate speech detection in low-resource Bangla YouTube comments. We propose a multi-stage fine-tuning framework built on a pre-trained Bangla language model, combining heterogeneous models into a hybrid ensemble and adding adversarial training to improve robustness and generalization. The method covers two fine-grained subtasks: hate-type classification (Subtask 1A) and target group identification (Subtask 1B). Evaluated on BLP-2025 Task 1, our approach achieves 73.23% micro-F1 on Subtask 1A (6th place) and 73.28% micro-F1 on Subtask 1B (3rd place), outperforming the baseline systems. Our key contribution is the combined application of ensemble learning, adversarial training, and domain-adaptive fine-tuning to low-resource Bangla hate speech detection, aimed at mitigating the overfitting and distribution shift that data scarcity induces.
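To make the ensemble idea concrete, here is a minimal sketch of soft voting, one common way to combine heterogeneous classifiers. The summary does not say which combination rule the authors used, so the averaging scheme, the function name `soft_vote`, and the toy probabilities below are all illustrative assumptions.

```python
from typing import List, Optional, Sequence


def soft_vote(
    prob_sets: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Combine per-class probabilities from several models.

    prob_sets[m][i][c] is model m's probability that example i has class c.
    Returns the index of the highest weighted-average class per example.
    """
    n_models = len(prob_sets)
    if weights is None:
        # Assumption: equal weight for every model.
        weights = [1.0 / n_models] * n_models
    n_examples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])

    predictions = []
    for i in range(n_examples):
        combined = [
            sum(w * prob_sets[m][i][c] for m, w in enumerate(weights))
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=combined.__getitem__))
    return predictions


# Hypothetical outputs of three fine-tuned models on two comments.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
model_c = [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]

print(soft_vote([model_a, model_b, model_c]))  # [1, 2]
```

On the first comment two of the three models prefer class 0, yet the averaged probabilities favor class 1. This is the sense in which soft voting uses model confidence rather than a simple majority of hard labels.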

📝 Abstract

This paper introduces the approach of "Gradient Masters" for BLP-2025 Task 1: "Bangla Multitask Hate Speech Identification Shared Task". We present an ensemble-based fine-tuning strategy for subtasks 1A (hate-type classification) and 1B (target group classification) in YouTube comments. We propose a hybrid approach built on a Bangla language model, which outperformed the baseline models and secured 6th position in subtask 1A with a micro F1 score of 73.23% and 3rd position in subtask 1B with a micro F1 score of 73.28%. We conducted extensive experiments to evaluate the robustness of the model throughout the development and evaluation phases, including comparisons with other language model variants, to measure generalization in low-resource Bangla hate speech scenarios and dataset coverage. In addition, we provide a detailed analysis of our findings, exploring misclassification patterns in hate speech detection.
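Both reported scores are micro F1, which pools true positives, false positives, and false negatives over all classes before computing precision and recall. The sketch below shows the computation in plain Python. The function name `micro_f1` and the label strings are placeholders, not the shared task's actual label set.

```python
from typing import Hashable, Sequence


def micro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Micro-averaged F1: counts are pooled across classes before averaging."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label and truth != label:
                fp += 1
            elif pred != label and truth == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Placeholder labels for five comments (not the real task labels).
gold = ["class_a", "class_b", "class_c", "class_a", "class_d"]
pred = ["class_a", "class_b", "class_a", "class_a", "class_c"]

score = micro_f1(gold, pred)
accuracy = sum(t == p for t, p in zip(gold, pred)) / len(gold)
print(round(score, 4), round(accuracy, 4))  # 0.6 0.6
```

When every comment receives exactly one label, each error counts once as a false positive and once as a false negative, so micro F1 coincides with plain accuracy, as the example shows.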

Problem

Research questions and friction points this paper is trying to address.

Developing ensemble-based adversarial training for Bengali hate speech detection

Addressing low-resource NLP challenges in Bangla language YouTube comments

Improving classification of hate types and target groups in Bengali content

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble-based fine-tuning strategy for hate speech

Hybrid approach on Bangla Language Model

Adversarial training for low-resource NLP scenarios
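The adversarial training item can be illustrated with a toy example. The summary does not name the specific technique, so the sketch below assumes a Fast Gradient Method (FGM) style perturbation, which is commonly applied to the embedding layer of a transformer. Here a two-feature logistic regression stands in for the language model, and every name, value, and hyperparameter is hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(w: List[float], b: float, x: List[float]) -> float:
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_and_input_grad(w, b, x, y) -> Tuple[float, List[float]]:
    """Binary cross-entropy and its gradient with respect to the input x."""
    p = predict(w, b, x)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return loss, [(p - y) * wi for wi in w]


def fgm_perturb(x, grad_x, epsilon=0.5) -> List[float]:
    """Move x by epsilon along the L2-normalized loss gradient."""
    norm = math.sqrt(sum(g * g for g in grad_x))
    if norm == 0.0:
        return list(x)
    return [xi + epsilon * g / norm for xi, g in zip(x, grad_x)]


def adversarial_step(w, b, x, y, lr=0.1, epsilon=0.5):
    """One gradient step on the sum of the clean and adversarial losses."""
    clean_loss, grad_x = loss_and_input_grad(w, b, x, y)
    x_adv = fgm_perturb(x, grad_x, epsilon)  # treated as a constant input
    adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
    err_clean = predict(w, b, x) - y
    err_adv = predict(w, b, x_adv) - y
    new_w = [
        wi - lr * (err_clean * xi + err_adv * xa)
        for wi, xi, xa in zip(w, x, x_adv)
    ]
    new_b = b - lr * (err_clean + err_adv)
    return new_w, new_b, clean_loss, adv_loss


w, b = [0.5, -0.3], 0.0
x, y = [1.0, 2.0], 1
new_w, new_b, clean_loss, adv_loss = adversarial_step(w, b, x, y)
updated_loss, _ = loss_and_input_grad(new_w, new_b, x, y)
print(clean_loss < adv_loss, updated_loss < clean_loss)  # True True
```

The perturbed input yields a higher loss than the clean one, and training on both pushes the model toward predictions that stay stable in a small neighborhood of each example. That stability is the robustness benefit usually cited for adversarial training when labeled data is scarce.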
