AI Summary
Problem: Harmful online content in Bengali is pervasive, yet high-quality detoxification resources and methods remain scarce for low-resource languages. Method: We propose the first explainable text detoxification framework tailored to low-resource languages, integrating Pareto-optimal large language models (LLMs) with chain-of-thought (CoT) prompting. This yields BanglaNirTox, a manually verified, attribution-annotated parallel corpus of 68,041 Bengali detoxification instances. Contribution/Results: Our framework pioneers the integration of multi-objective Pareto optimization with explainable reasoning, significantly improving the quality, consistency, and transparency of detoxification. BanglaNirTox and the proposed methodology establish critical infrastructure and a novel technical paradigm for explainable AI and content-safety research in low-resource linguistic settings.
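To make the Pareto-optimization step concrete, here is a minimal sketch of selecting the non-dominated set of candidate LLMs over several evaluation axes. The model names, metric axes, and scores are hypothetical placeholders for illustration, not results from the paper.

```python
# Minimal sketch: multi-objective Pareto selection over candidate LLMs.
# Model names and scores below are hypothetical, not the paper's evaluations.
from typing import Dict, List

# Hypothetical per-model scores (higher is better on every axis):
# [detoxification quality, meaning preservation, fluency]
candidates: Dict[str, List[float]] = {
    "model_a": [0.82, 0.74, 0.90],
    "model_b": [0.88, 0.70, 0.85],
    "model_c": [0.80, 0.72, 0.84],  # dominated by model_a on all three axes
}

def dominates(x: List[float], y: List[float]) -> bool:
    """True if x is at least as good as y everywhere and strictly better somewhere."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

def pareto_front(scores: Dict[str, List[float]]) -> List[str]:
    """Return the models not dominated by any other candidate."""
    return [
        name
        for name, vec in scores.items()
        if not any(dominates(other, vec)
                   for other_name, other in scores.items()
                   if other_name != name)
    ]

print(pareto_front(candidates))  # -> ['model_a', 'model_b']
```

Models on the resulting front trade off the objectives without being strictly worse than any alternative, which is the sense in which the chosen annotation LLMs are "Pareto-optimal" here.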
Abstract
Toxic language in Bengali remains prevalent, especially online, yet few effective safeguards against it exist. Although text detoxification has seen progress in high-resource languages, Bengali remains underexplored due to limited resources. In this paper, we propose a novel pipeline for Bengali text detoxification that combines Pareto class-optimized large language models (LLMs) and Chain-of-Thought (CoT) prompting to generate detoxified sentences. To support this effort, we construct BanglaNirTox, an artificially generated parallel corpus of 68,041 toxic Bengali sentences with class-wise toxicity labels, reasoning, and detoxified paraphrases, produced with Pareto-optimized LLMs and evaluated on random samples. The resulting BanglaNirTox dataset is used to fine-tune language models that produce better detoxified versions of Bengali sentences. Our findings show that Pareto-optimized LLMs with CoT prompting significantly improve the quality and consistency of Bengali text detoxification.
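As an illustration of the CoT annotation step, the sketch below has a single prompt elicit all three corpus fields, a class-wise toxicity label, a short reasoning, and a detoxified paraphrase, for each toxic sentence. The prompt wording, JSON field names, and call_llm helper are assumptions made for illustration; the paper's actual prompt is not reproduced here.

```python
# Minimal sketch of the CoT-style annotation step. The prompt text, field
# names, and call_llm() are hypothetical placeholders.
import json

COT_PROMPT = """You are a Bengali content-safety assistant.
Given the toxic Bengali sentence below, think step by step and answer in JSON with:
1. "label": the toxicity class (e.g. insult, threat, profanity).
2. "reasoning": a brief explanation of why the sentence is toxic.
3. "detoxified": a fluent Bengali paraphrase that removes the toxicity
   while preserving the original meaning.

Sentence: {sentence}
"""

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call (e.g. an OpenAI or HF client)."""
    raise NotImplementedError

def annotate(sentence: str) -> dict:
    """Produce one parallel-corpus record: toxic source plus the three CoT fields."""
    raw = call_llm(COT_PROMPT.format(sentence=sentence))
    record = json.loads(raw)   # expects the three fields requested above
    record["toxic"] = sentence # keep the source side of the parallel pair
    return record
```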
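The resulting toxic/detoxified pairs can then feed standard sequence-to-sequence fine-tuning. The sketch below assumes a Hugging Face workflow; the checkpoint (google/mt5-small), hyperparameters, and placeholder data are illustrative assumptions, not the models or settings evaluated in the paper.

```python
# Minimal sketch: seq2seq fine-tuning on toxic -> detoxified pairs, assuming
# a Hugging Face-style setup. Checkpoint and hyperparameters are placeholders.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "google/mt5-small"  # any multilingual seq2seq model covering Bengali
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# In practice these columns would come from the BanglaNirTox parallel corpus.
pairs = Dataset.from_dict({
    "toxic": ["<toxic Bengali sentence>"],
    "detoxified": ["<its detoxified paraphrase>"],
})

def tokenize(batch):
    enc = tokenizer(batch["toxic"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(text_target=batch["detoxified"],
                              truncation=True, max_length=128)["input_ids"]
    return enc

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="banglanirtox-detox",
                                  per_device_train_batch_size=8,
                                  num_train_epochs=3),
    train_dataset=pairs.map(tokenize, batched=True),
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```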