Rajshahi Division
RoadSens-4M: A Multimodal Smartphone & Camera Dataset for Holistic Road-way Analysis
Khandakar, Amith, Michelson, David, Rabbani, Shaikh Golam, Shafi, Fariya Bintay, Ahamed, Md. Faysal, Rahman, Khondokar Radwanur, Rahman, Md Abidur, Nabi, Md. Fahmidun, Ayari, Mohamed Arselene, Khan, Khaled, Suganthan, Ponnuthurai Nagaratnam
It's important to monitor road issues such as bumps and potholes to enhance safety and improve road conditions. Smartphones are equipped with various built-in sensors that offer a cost-effective and straightforward way to assess road quality. However, progress in this area has been slow due to the lack of high-quality, standardized datasets. This paper discusses a new dataset created by a mobile app that collects sensor data from devices like GPS, accelerometers, gyroscopes, magnetometers, gravity sensors, and orientation sensors. This dataset is one of the few that integrates Geographic Information System (GIS) data with weather information and video footage of road conditions, providing a comprehensive understanding of road issues with geographic context. The dataset allows for a clearer analysis of road conditions by compiling essential data, including vehicle speed, acceleration, rotation rates, and magnetic field intensity, along with the visual and spatial context provided by GIS, weather, and video data. Its goal is to provide funding for initiatives that enhance traffic management, infrastructure development, road safety, and urban planning. Additionally, the dataset will be publicly accessible to promote further research and innovation in smart transportation systems.
- North America > United States (0.14)
- Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
- North America > Canada > British Columbia (0.04)
- Information Technology (1.00)
- Transportation > Ground > Road (0.88)
RegSpeech12: A Regional Corpus of Bengali Spontaneous Speech Across Dialects
Hassan, Md. Rezuwan, Hossain, Azmol, Fatema, Kanij, Faruque, Rubayet Sabbir, Shome, Tanmoy, Naswan, Ruwad, Chakraborty, Trina, Zihad, Md. Foriduzzaman, Dipto, Tawsif Tashwar, Tasnim, Nazia, Ansary, Nazmuddoha, Shawon, Md. Mehedi Hasan, Humayun, Ahmed Imtiaz, Alam, Md. Golam Rabiul, Sadeque, Farig, Sushmit, Asif
The Bengali language, spoken extensively across South Asia and among diasporic communities, exhibits considerable dialectal diversity shaped by geography, culture, and history. Phonological and pronunciation-based classifications broadly identify five principal dialect groups: Eastern Bengali, Manbhumi, Rangpuri, Varendri, and Rarhi. Within Bangladesh, further distinctions emerge through variation in vocabulary, syntax, and morphology, as observed in regions such as Chittagong, Sylhet, Rangpur, Rajshahi, Noakhali, and Barishal. Despite this linguistic richness, systematic research on the computational processing of Bengali dialects remains limited. This study seeks to document and analyze the phonetic and morphological properties of these dialects while exploring the feasibility of building computational models, particularly Automatic Speech Recognition (ASR) systems, tailored to regional varieties. Such efforts hold potential for applications in virtual assistants and broader language technologies, contributing to both the preservation of dialectal diversity and the advancement of inclusive digital tools for Bengali-speaking communities. The dataset created for this study is released for public use.
- Asia > Bangladesh > Rangpur Division > Rangpur District > Rangpur (0.25)
- Asia > India (0.05)
- South America > Brazil (0.04)
GroundHog: Revolutionizing GLDAS Groundwater Storage Downscaling for Enhanced Recharge Estimation in Bangladesh
Ahmed, Saleh Sakib, Zzaman, Rashed Uz, Jony, Saifur Rahman, Himel, Faizur Rahman, Sharmin, Afroza, Rahman, A. H. M. Khalequr, Rahman, M. Sohel, Nowreen, Sara
Long-term groundwater level (GWL) measurement is vital for effective policymaking and recharge estimation using annual maxima and minima. However, current methods prioritize short-term predictions and lack multi-year applicability, limiting their utility. Moreover, sparse in-situ measurements lead to reliance on low-resolution satellite data like GLDAS as the ground truth for Machine Learning models, further constraining accuracy. To overcome these challenges, we first develop an ML model to mitigate data gaps, achieving $R^2$ scores of 0.855 and 0.963 for maximum and minimum GWL predictions, respectively. Subsequently, using these predictions and well observations as ground truth, we train an Upsampling Model that uses low-resolution (25 km) GLDAS data as input to produce high-resolution (2 km) GWLs, achieving an excellent $R^2$ score of 0.96. Our approach successfully upscales GLDAS data for 2003-2024, allowing high-resolution recharge estimations and revealing critical trends for proactive resource management. Our method allows upsampling of groundwater storage (GWS) from GLDAS to high-resolution GWLs for any points independently of officially curated piezometer data, making it a valuable tool for decision-making.
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.06)
- Asia > India > Maharashtra (0.04)
- Asia > China (0.04)
PDSum: Prototype-driven Continuous Summarization of Evolving Multi-document Sets Stream
Yoon, Susik, Chan, Hou Pong, Han, Jiawei
Summarizing text-rich documents has long been studied in the literature, but most existing efforts summarize a static, predefined multi-document set. With the rapid development of online platforms for generating and distributing text-rich documents, there arises an urgent need for continuously summarizing dynamically evolving multi-document sets where the composition of documents and sets is changing over time. This is especially challenging as the summarization should be not only effective in incorporating relevant, novel, and distinctive information from each concurrent multi-document set, but also efficient in serving online applications. In this work, we propose a new summarization problem, Evolving Multi-Document sets stream Summarization (EMDS), and introduce a novel unsupervised algorithm PDSum with the idea of prototype-driven continuous summarization. PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents while preserving accumulated knowledge from previous documents. To update new summaries, the most representative sentences for each multi-document set are extracted by measuring their similarities to the prototypes. A thorough evaluation with real multi-document sets streams demonstrates that PDSum outperforms state-of-the-art unsupervised multi-document summarization algorithms in EMDS in terms of relevance, novelty, and distinctiveness and is also robust to various evaluation settings.
- South America > Venezuela (0.28)
- North America > United States > Texas > Travis County > Austin (0.05)
- North America > Cuba (0.04)
- Research Report (1.00)
- Overview (0.93)