Goto

Collaborating Authors

 Pan, Xiao


Post-disaster building indoor damage and survivor detection using autonomous path planning and deep learning with unmanned aerial vehicles

arXiv.org Artificial Intelligence

Rapid response to natural disasters such as earthquakes is a crucial element in ensuring the safety of civil infrastructures and minimizing casualties. Traditional manual inspection is labour-intensive, time-consuming, and can be dangerous for inspectors and rescue workers. This paper proposed an autonomous inspection approach for structural damage inspection and survivor detection in the post-disaster building indoor scenario, which incorporates an autonomous navigation method, deep learning-based damage and survivor detection method, and a customized low-cost micro aerial vehicle (MAV) with onboard sensors. Experimental studies in a pseudo-post-disaster office building have shown the proposed methodology can achieve high accuracy in structural damage inspection and survivor detection. Overall, the proposed inspection approach shows great potential to improve the efficiency of existing manual post-disaster building inspection.


Research on Foundation Model for Spatial Data Intelligence: China's 2024 White Paper on Strategic Development of Spatial Data Intelligence

arXiv.org Artificial Intelligence

Research status and development trends; on this basis, this report proposes three major challenges faced by large spatial data intelligent models today. This report focuses on the current research status of spatial data intelligent large-scale models and sorts out the research progress in four major thematic areas of spatial data intelligent large-scale models: cities, air and space remote sensing, geography, and transportation. This report systematically introduces the key technologies, characteristics and advantages, research status, future development and other core information of spatial data intelligent large models, involving spatiotemporal big data platforms, distributed computing, 3D virtual reality, space The basic performance of large models such as analysis and visualization, as well as the complex spatial comprehensive performance of large models such as geospatial intelligent computing, deep learning, high-performance processing of big data, geographical knowledge graphs, and geographical intelligent multi-scenario simulation, analyze the application of the above key technologies in spatial data The location and role of smart large models.


Masked Audio Text Encoders are Effective Multi-Modal Rescorers

arXiv.org Artificial Intelligence

Masked Language Models (MLMs) have proven to be effective for second-pass rescoring in Automatic Speech Recognition (ASR) systems. In this work, we propose Masked Audio Text Encoder (MATE), a multi-modal masked language model rescorer which incorporates acoustic representations into the input space of MLM. We adopt contrastive learning for effectively aligning the modalities by learning shared representations. We show that using a multi-modal rescorer is beneficial for domain generalization of the ASR system when target domain data is unavailable. MATE reduces word error rate (WER) by 4%-16% on in-domain, and 3%-7% on out-of-domain datasets, over the text-only baseline. Additionally, with very limited amount of training data (0.8 hours), MATE achieves a WER reduction of 8%-23% over the first-pass baseline.