BeliN: A Novel Corpus for Bengali Religious News Headline Generation using Contextual Feature Fusion
Osama, Md; Dey, Ashim; Ahmed, Kawsar; Kabir, Muhammad Ashad
arXiv.org Artificial Intelligence
Automatic text summarization, and headline generation in particular, remains a critical yet underexplored task for Bengali religious news. Existing headline generation approaches typically rely on the article content alone and overlook contextual features such as sentiment, category, and aspect, which limits their performance. This study addresses the gap with two contributions: BeliN (Bengali Religious News), a novel corpus of religious news articles from prominent Bangladeshi online newspapers, and MultiGen, a contextual multi-input feature fusion approach to headline generation. Built on transformer-based pre-trained language models such as BanglaT5, mBART, mT5, and mT0, MultiGen combines the news content with additional contextual features (category, aspect, and sentiment), so the model can capture contextual information that content-only methods miss. In experiments, MultiGen outperforms the baseline that uses only news content, achieving a BLEU score of 18.61 and a ROUGE-L score of 24.19, compared with 16.08 and 23.08, respectively, for the baseline. These findings underscore the importance of incorporating contextual features in headline generation for low-resource languages. By bridging linguistic and cultural gaps, this research advances natural language processing for Bengali and other underrepresented languages. To promote reproducibility and further exploration, the dataset and implementation code are publicly available at https://github.com/akabircs/BeliN.
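The abstract describes MultiGen as fusing category, aspect, and sentiment with the news content, but does not state how the inputs are combined. The sketch below shows one simple possibility, assuming the features are serialized as labelled text fields and prepended to the article before it is passed to a sequence-to-sequence model. The `NewsItem` class, the `fuse_features` and `content_only` helpers, the field labels, the separator, and the example values are all hypothetical and are not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass
class NewsItem:
    """One article with its contextual annotations (hypothetical schema)."""
    content: str
    category: str
    aspect: str
    sentiment: str


def fuse_features(item: NewsItem, sep: str = " | ") -> str:
    """Build a single model input by prepending labelled contextual
    features to the article text. This text-level concatenation is an
    assumption; the paper may fuse the inputs differently."""
    parts = [
        f"category: {item.category}",
        f"aspect: {item.aspect}",
        f"sentiment: {item.sentiment}",
        f"content: {item.content}",
    ]
    return sep.join(parts)


def content_only(item: NewsItem) -> str:
    """Baseline-style input that carries only the article text."""
    return f"content: {item.content}"


# Illustrative example with made-up annotation values.
example = NewsItem(
    content="Eid prayers were held across the country.",
    category="festival",
    aspect="prayer",
    sentiment="positive",
)
fused_input = fuse_features(example)
baseline_input = content_only(example)
```

Under this assumption, the fused string and the content-only string would each be tokenized and fed to the same pre-trained model, so any difference in headline quality can be attributed to the added contextual fields.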
Jan-2-2025
- Country:
  - Asia
    - Bangladesh
      - Dhaka Division > Dhaka District
        - Dhaka (0.04)
      - Rangpur Division > Rangpur District
        - Rangpur (0.04)
    - China > Hong Kong (0.04)
    - Middle East
      - Saudi Arabia (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.04)
    - Pakistan (0.04)
  - Europe
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Switzerland (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Michigan > Washtenaw County
        - Ann Arbor (0.04)
      - New York > New York County
        - New York City (0.04)
      - Pennsylvania > Philadelphia County
        - Philadelphia (0.04)
      - Texas > Travis County
        - Austin (0.04)
  - Oceania > Australia (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: