Collaborating Authors

 Li, Haozhou


SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults

arXiv.org Artificial Intelligence

The reliability of substation equipment is crucial to the stability of power systems, but traditional fault analysis methods rely heavily on manual expertise, which limits their effectiveness in handling complex and large-scale data. This paper proposes a substation equipment fault analysis method based on a multimodal large language model (MLLM). We developed a database containing 40,000 entries, including images, defect labels, and analysis reports, and used an image-to-video generation model for data augmentation. Detailed fault analysis reports were generated using GPT-4. Based on this database, we developed SubstationAI, the first model dedicated to substation fault analysis, and designed a fault diagnosis knowledge base along with knowledge enhancement methods. Experimental results show that SubstationAI significantly outperforms existing models, such as GPT-4, across various evaluation metrics. It demonstrates higher accuracy and practicality in fault cause analysis, repair suggestions, and preventive measures, providing a more advanced solution for substation equipment fault analysis.


LCMDC: Large-scale Chinese Medical Dialogue Corpora for Automatic Triage and Medical Consultation

arXiv.org Artificial Intelligence

The global COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services, especially in medical triage and consultation. However, existing studies face two main challenges. First, large-scale, publicly available, domain-specific medical datasets are scarce because of privacy concerns; current datasets are small and cover only a few diseases, which limits the effectiveness of triage methods based on Pre-trained Language Models (PLMs). Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations. To overcome these obstacles, we construct the Large-scale Chinese Medical Dialogue Corpora (LCMDC), comprising a Coarse-grained Triage dataset with 439,630 samples, a Fine-grained Diagnosis dataset with 199,600 samples, and a Medical Consultation dataset with 472,418 items, thereby addressing the data shortage in this field. We also propose a novel triage system that combines BERT-based supervised learning with prompt learning, as well as a GPT-based medical consultation model trained with reinforcement learning. To enhance domain knowledge acquisition, we pre-trained PLMs using our self-constructed background corpus. Experimental results on the LCMDC demonstrate the efficacy of our proposed systems.