Goto

Collaborating Authors

 Đắk Nông Province


A comprehensive GeoAI review: Progress, Challenges and Outlooks

arXiv.org Artificial Intelligence

In recent years, Geospatial Artificial Intelligence (GeoAI) has gained traction in the most relevant research works and industrial applications, while also becoming involved in various fields of use. This paper offers a comprehensive review of GeoAI as a synergistic concept applying Artificial Intelligence (AI) methods and models to geospatial data. A preliminary study is carried out, identifying the methodology of the work, the research motivations, the issues and the directions to be tracked, followed by exploring how GeoAI can be used in various interesting fields of application, such as precision agriculture, environmental monitoring, disaster management and urban planning. Next, a statistical and semantic analysis is carried out, followed by a clear and precise presentation of the challenges facing GeoAI. Then, a concrete exploration of the future prospects is provided, based on several informations gathered during the census. To sum up, this paper provides a complete overview of the correlation between AI and the geospatial domain, while mentioning the researches conducted in this context, and emphasizing the close relationship linking GeoAI with other advanced concepts such as geographic information systems (GIS) and large-scale geospatial data, known as big geodata. This will enable researchers and scientific community to assess the state of progress in this promising field, and will help other interested parties to gain a better understanding of the issues involved.


Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges

arXiv.org Artificial Intelligence

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) Dialect identification and (2) Speech recognition. The empirical results suggest two implications including the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.


Natural Disaster Analysis using Satellite Imagery and Social-Media Data for Emergency Response Situations

arXiv.org Artificial Intelligence

Disaster Management is one of the most promising research areas because of its significant economic, environmental and social repercussions. This research focuses on analyzing different types of data (pre and post satellite images and twitter data) related to disaster management for in-depth analysis of location-wise emergency requirements. This research has been divided into two stages, namely, satellite image analysis and twitter data analysis followed by integration using location. The first stage involves pre and post disaster satellite image analysis of the location using multi-class land cover segmentation technique based on U-Net architecture. The second stage focuses on mapping the region with essential information about the disaster situation and immediate requirements for relief operations. The severely affected regions are demarcated and twitter data is extracted using keywords respective to that location. The extraction of situational information from a large corpus of raw tweets adopts Content Word based Tweet Summarization (COWTS) technique. An integration of these modules using real-time location-based mapping and frequency analysis technique gathers multi-dimensional information in the advent of disaster occurrence such as the Kerala and Mississippi floods that were analyzed and validated as test cases. The novelty of this research lies in the application of segmented satellite images for disaster relief using highlighted land cover changes and integration of twitter data by mapping these region-specific filters for obtaining a complete overview of the disaster.