Shahariar, G M
Gender Bias Mitigation for Bangla Classification Tasks
Joy, Sajib Kumar Saha, Mahy, Arman Hassan, Sultana, Meherin, Abha, Azizah Mamun, Ahmmed, MD Piyal, Dong, Yue, Shahariar, G M
In this study, we investigate gender bias in Bangla pretrained language models, a largely under explored area in low-resource languages. To assess this bias, we applied gender-name swapping techniques to existing datasets, creating four manually annotated, task-specific datasets for sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection. By altering names and gender-specific terms, we ensured these datasets were suitable for detecting and mitigating gender bias. We then proposed a joint loss optimization technique to mitigate gender bias across task-specific pretrained models. Our approach was evaluated against existing bias mitigation methods, with results showing that our technique not only effectively reduces bias but also maintains competitive accuracy compared to other baseline approaches. To promote further research, we have made both our implementation and datasets publicly available https://github.com/sajib-kumar/Gender-Bias-Mitigation-From-Bangla-PLM
Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation
Shahariar, G M, Chen, Jia, Li, Jiachen, Dong, Yue
Recent studies show that text-to-image (T2I) models are vulnerable to adversarial attacks, especially with noun perturbations in text prompts. In this study, we investigate the impact of adversarial attacks on different POS tags within text prompts on the images generated by T2I models. We create a high-quality dataset for realistic POS tag token swapping and perform gradient-based attacks to find adversarial suffixes that mislead T2I models into generating images with altered tokens. Our empirical results show that the attack success rate (ASR) varies significantly among different POS tag categories, with nouns, proper nouns, and adjectives being the easiest to attack. We explore the mechanism behind the steering effect of adversarial suffixes, finding that the number of critical tokens and content fusion vary among POS tags, while features like suffix transferability are consistent across categories. We have made our implementation publicly available at - https://github.com/shahariar-shibli/Adversarial-Attack-on-POS-Tags.