BanglaNirTox: A Large-scale Parallel Corpus for Explainable AI in Bengali Text Detoxification

Ayesha Afroza Mohsin, Mashrur Ahsan, Nafisa Maliyat, Shanta Maria, Syed Rifat Raiyan, Hasan Mahmud, Md Kamrul Hasan

arXiv.org Artificial Intelligence 

Toxic language remains prevalent in Bengali, especially in online environments, and few effective safeguards exist against it. Although text detoxification has seen progress in high-resource languages, Bengali remains under-explored due to limited resources. In this paper, we propose a novel pipeline for Bengali text detoxification that combines Pareto class-optimized large language models (LLMs) with Chain-of-Thought (CoT) prompting to generate detoxified sentences. To support this effort, we construct BANGLANIRTOX, an artificially generated parallel corpus of 68,041 toxic Bengali sentences with class-wise toxicity labels, reasoning, and detoxified paraphrases, created using Pareto-optimized LLMs evaluated on random samples. The resulting BANGLANIRTOX dataset is used to fine-tune language models to produce better detoxified versions of Bengali sentences. Our findings show that Pareto-optimized LLMs with CoT prompting significantly enhance the quality and consistency of Bengali text detoxification. Warning: This paper contains examples of toxic and offensive language.
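The CoT prompting step described above can be sketched as follows; this is a minimal illustration only, and the prompt template, toxicity label names, and the `build_detox_prompt` helper are assumptions for exposition, not the authors' actual implementation.

```python
# Hypothetical sketch of Chain-of-Thought (CoT) prompting for text
# detoxification. The template wording and field names are assumptions;
# the paper's exact prompt is not reproduced here.

def build_detox_prompt(toxic_sentence: str, toxicity_class: str) -> str:
    """Assemble a CoT prompt asking an LLM to reason before rewriting."""
    return (
        "You are a Bengali text detoxification assistant.\n"
        f"Toxic sentence: {toxic_sentence}\n"
        f"Toxicity class: {toxicity_class}\n"
        "Step 1: Identify which words or phrases make the sentence toxic.\n"
        "Step 2: Rewrite the sentence, preserving its meaning while "
        "removing the toxicity.\n"
        "Respond with your reasoning followed by the detoxified sentence."
    )

# Example usage with a placeholder sentence and an assumed class label.
prompt = build_detox_prompt("<toxic Bengali sentence>", "insult")
print(prompt)
```

In this sketch the class-wise toxicity label is injected into the prompt so that a class-optimized model can condition its reasoning on the toxicity type before paraphrasing.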