BanglaMM-Disaster: A Multimodal Transformer-Based Deep Learning Framework for Multiclass Disaster Classification in Bangla
Islam, Ariful; Hossen, Md Rifat; Arif, Md. Mahmudul; Noman, Abdullah Al; Rahman, Md Arifur
arXiv.org Artificial Intelligence
Natural disasters remain a major challenge for Bangladesh, so real-time monitoring and quick response systems are essential. In this study, we present BanglaMM-Disaster, an end-to-end deep learning-based multimodal framework for disaster classification in Bangla, using both textual and visual data from social media. We constructed a new dataset of 5,037 Bangla social media posts, each consisting of a caption and a corresponding image, annotated into one of nine disaster-related categories. The proposed model integrates transformer-based text encoders, including BanglaBERT, mBERT, and XLM-RoBERTa, with CNN backbones such as ResNet50, DenseNet169, and MobileNetV2, to process the two modalities. Using early fusion, the best model achieves 83.76% accuracy. This surpasses the best text-only baseline by 3.84% and the image-only baseline by 16.91%. Our analysis also shows reduced misclassification across all classes, with noticeable improvements for ambiguous examples. This work fills a key gap in Bangla multimodal disaster analysis and demonstrates the benefits of combining multiple data types for real-time disaster response in low-resource settings.
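As a rough illustration of the early-fusion idea described in the abstract, the sketch below concatenates a text feature vector with an image feature vector and passes the joint vector through a single linear layer with softmax over nine classes. It is a minimal sketch, not the authors' implementation: it assumes "early fusion" means feature-level concatenation before a shared classifier, and the feature dimensions, random weights, and function names (`early_fusion`, `linear_layer`, `classify`) are hypothetical stand-ins for the outputs of an encoder such as BanglaBERT and a CNN backbone such as ResNet50.

```python
import math
import random

NUM_CLASSES = 9  # the abstract states nine disaster-related categories


def early_fusion(text_features, image_features):
    """Concatenate the two modality feature vectors into one joint vector."""
    return list(text_features) + list(image_features)


def linear_layer(inputs, weights, biases):
    """Dense layer: one output logit per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text_features, image_features, weights, biases):
    """Fuse both modalities, then return (predicted class index, probabilities)."""
    fused = early_fusion(text_features, image_features)
    probs = softmax(linear_layer(fused, weights, biases))
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs


# Toy stand-ins (assumptions): real encoders would produce much larger
# vectors, and the classifier weights would be learned, not random.
rng = random.Random(0)
TEXT_DIM, IMAGE_DIM = 8, 6
text_vec = [rng.uniform(-1, 1) for _ in range(TEXT_DIM)]
image_vec = [rng.uniform(-1, 1) for _ in range(IMAGE_DIM)]
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(TEXT_DIM + IMAGE_DIM)]
    for _ in range(NUM_CLASSES)
]
biases = [0.0] * NUM_CLASSES

predicted_class, probabilities = classify(text_vec, image_vec, weights, biases)
```

Because the classifier sees both modalities in one vector, it can in principle weigh caption and image evidence jointly, which is one plausible reading of why the fused model outperforms the single-modality baselines reported above.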
Nov-27-2025
- Country:
- Africa > Angola (0.04)
- Asia > Bangladesh (0.26)
- North America > United States
- Delaware > New Castle County > New Castle (0.04)
- Genre:
- Research Report > New Finding (0.48)
- Technology: