Expert Systems
Transformer-Based Bearing Fault Detection using Temporal Decomposition Attention Mechanism
Mirzaeibonehkhater, Marzieh, Labbaf-Khaniki, Mohammad Ali, Manthouri, Mohammad
Bearing fault detection is a critical task in predictive maintenance, where accurate and timely fault identification can prevent costly downtime and equipment damage. Traditional attention mechanisms in Transformer neural networks often struggle to capture the complex temporal patterns in bearing vibration data, leading to suboptimal performance. To address this limitation, we propose a novel attention mechanism, Temporal Decomposition Attention (TDA), which combines temporal bias encoding with seasonal-trend decomposition to capture both long-term dependencies and periodic fluctuations in time series data. Additionally, we incorporate the Hull Exponential Moving Average (HEMA) for feature extraction, enabling the model to effectively capture meaningful characteristics from the data while reducing noise. Our approach integrates TDA into the Transformer architecture, allowing the model to focus separately on the trend and seasonal components of the data. Experimental results on the Case Western Reserve University (CWRU) bearing fault detection dataset demonstrate that our approach outperforms traditional attention mechanisms and achieves state-of-the-art performance in terms of accuracy and interpretability. The HEMA-Transformer-TDA model achieves an accuracy of 98.1%, with exceptional precision, recall, and F1-scores, demonstrating its effectiveness in bearing fault detection and its potential for application in other time series tasks with seasonal patterns or trends.
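The two preprocessing ideas named in this abstract, Hull-style smoothing and seasonal-trend decomposition, can be illustrated with a minimal sketch. The abstract does not define HEMA or the decomposition used inside TDA, so everything below is an assumption: `hull_ema` applies the standard Hull moving-average construction with exponential averages in place of weighted ones, and `seasonal_trend_split` is a plain additive split using a centered moving average and per-phase means. All function names are hypothetical.

```python
import math


def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    prev = values[0]
    for v in values:
        prev = alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def hull_ema(values, span):
    """Hull-style smoother built from EMAs (assumed form, not taken from the paper)."""
    half = max(1, span // 2)
    root = max(1, int(round(math.sqrt(span))))
    fast = ema(values, half)
    slow = ema(values, span)
    # Overweight the fast average to reduce lag, then smooth the result again.
    raw = [2 * f - s for f, s in zip(fast, slow)]
    return ema(raw, root)


def moving_average_trend(values, window):
    """Centered moving average; the window shrinks at the edges of the series."""
    n = len(values)
    half = window // 2
    trend = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        trend.append(sum(values[lo:hi]) / (hi - lo))
    return trend


def seasonal_trend_split(values, period):
    """Additive split: trend from a moving average, seasonal from per-phase means."""
    trend = moving_average_trend(values, period)
    detrended = [v - t for v, t in zip(values, trend)]
    phase_means = []
    for p in range(period):
        phase = detrended[p::period]
        phase_means.append(sum(phase) / len(phase))
    seasonal = [phase_means[i % period] for i in range(len(values))]
    return trend, seasonal
```

In a pipeline of the kind described, the smoothed signal would be the model input and the trend and seasonal series would be attended to separately; how the paper actually wires these together is not stated in the abstract.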
Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
Liao, Huanxuan, He, Shizhu, Xu, Yao, Zhang, Yuanzhe, Liu, Kang, Zhao, Jun
In this paper, we propose $\textbf{Ne}$ural-$\textbf{Sy}$mbolic $\textbf{C}$ollaborative $\textbf{D}$istillation ($\textbf{NesyCD}$), a novel knowledge distillation method for learning the complex reasoning abilities of Large Language Models (LLMs, e.g., $>$ 13B). We argue that complex reasoning tasks are difficult for Small Language Models (SLMs, e.g., $\leq$ 7B), as these tasks demand not only general cognitive abilities but also specialized knowledge, which is often sparse and difficult for these neural-based SLMs to effectively capture. Therefore, NesyCD distills the general capabilities and the specialized knowledge in LLMs in different ways. On the one hand, we distill only general abilities from teacher LLMs into the student SLMs of parameterized neural networks. On the other hand, for the specialized abilities and uncommon knowledge of a complex reasoning task, we employ a symbolic knowledge distillation approach to obtain and store the specialized knowledge within a symbolic knowledge base (KB). By decoupling general and specialized capabilities, the proposed NesyCD can achieve superior performance cost-effectively, utilizing smaller models and blending parameterized neural networks with a symbolic KB. Moreover, the specialized KB generalizes well and can be comprehended and manipulated by humans. Our experiments show that NesyCD significantly boosts SLMs' complex reasoning performance on in-domain (BBH, GSM8K) and out-of-domain (AGIEval, ARC) datasets. Notably, our approach enables LLaMA3-8B and Qwen2-7B to surpass GPT-3.5-turbo in performance and come close to matching LLaMA3-70B, despite the latter having nine times more parameters. Our code will be available at https://github.com/Xnhyacinth/NesyCD.
$\textit{SKIntern}$: Internalizing Symbolic Knowledge for Distilling Better CoT Capabilities into Small Language Models
Liao, Huanxuan, He, Shizhu, Hao, Yupu, Li, Xiang, Zhang, Yuanzhe, Zhao, Jun, Liu, Kang
Small Language Models (SLMs) are attracting attention due to the high computational demands and privacy concerns of Large Language Models (LLMs). Some studies fine-tune SLMs using Chains of Thought (CoT) data distilled from LLMs, aiming to enhance their reasoning ability. Furthermore, some CoT distillation methods introduce external symbolic knowledge into the generation process to improve the limited knowledge memory, reasoning ability, and out-of-domain (OOD) generalization of SLMs. However, the introduction of symbolic knowledge increases computational overhead and introduces potential noise. In this paper, we introduce $\textit{SKIntern}$, an innovative approach that empowers SLMs to internalize symbolic knowledge and few-shot examples gradually through a progressive fine-tuning process, guided by a predefined linear decay schedule under curriculum learning. By efficiently internalizing knowledge, $\textit{SKIntern}$ reduces computational overhead and speeds up the reasoning process by focusing solely on the question during inference. It outperforms state-of-the-art baselines by over 5\%, while reducing inference costs (measured in FLOPs) by up to $4\times$ across a wide range of SLMs in both in-domain (ID) and out-of-domain (OOD) tasks. Our code will be available at \url{https://github.com/Xnhyacinth/SKIntern}.
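A linear decay schedule of the kind mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the SKIntern implementation: the abstract does not say what quantity decays or how prompts are assembled, so the sketch assumes the kept fraction of symbolic knowledge tokens and few-shot examples falls linearly from 1 to 0 over training. The names `linear_decay` and `build_prompt` are hypothetical.

```python
def linear_decay(step, total_steps, start=1.0, end=0.0):
    """Linearly interpolate from start to end as step goes from 0 to total_steps."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def build_prompt(question, knowledge_tokens, examples, step, total_steps):
    """Assemble a training prompt whose auxiliary context shrinks over time.

    Assumption: the same decay ratio applies to both the few-shot examples
    and the symbolic knowledge tokens.
    """
    keep = linear_decay(step, total_steps)
    n_examples = int(len(examples) * keep)
    n_tokens = int(len(knowledge_tokens) * keep)
    parts = examples[:n_examples]  # slicing copies, so the caller's list is untouched
    if n_tokens:
        parts.append(" ".join(knowledge_tokens[:n_tokens]))
    parts.append(question)
    return "\n".join(parts)
```

Under this sketch, the prompt at the final step contains only the question, which matches the abstract's description of inference that focuses solely on the question.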
Hybrid Model-Data Fault Diagnosis for Wafer Handler Robots: Tilt and Broken Belt Cases
van Esch, Tim, Ghanipoor, Farhad, Murguia, Carlos, van de Wouw, Nathan
This work proposes a hybrid model- and data-based scheme for fault detection, isolation, and estimation (FDIE) for a class of wafer handler (WH) robots. The proposed hybrid scheme consists of: 1) a linear filter that simultaneously estimates system states and fault-induced signals from sensing and actuation data; and 2) a data-driven classifier, in the form of a support vector machine (SVM), that detects and isolates the fault type using estimates generated by the filter. We demonstrate the effectiveness of the scheme for two critical fault types for WH robots used in the semiconductor industry: broken-belt in the lower arm of the WH robot (an abrupt fault) and tilt in the robot arms (an incipient fault). We derive explicit models of the robot motion dynamics induced by these faults and test the diagnostics scheme in a realistic simulation-based case study. These case study results demonstrate that the proposed hybrid FDIE scheme achieves superior performance compared to purely data-driven methods.
GEE-OPs: An Operator Knowledge Base for Geospatial Code Generation on the Google Earth Engine Platform Powered by Large Language Models
Hou, Shuyang, Liang, Jianyuan, Zhao, Anqi, Wu, Huayi
As the scale and complexity of spatiotemporal data continue to grow rapidly, the use of geospatial modeling on the Google Earth Engine (GEE) platform presents dual challenges: improving the coding efficiency of domain experts and enhancing the coding capabilities of interdisciplinary users. To address these challenges and improve the performance of large language models (LLMs) in geospatial code generation tasks, we propose a framework for building a geospatial operator knowledge base tailored to the GEE JavaScript API. This framework consists of an operator syntax knowledge table, an operator relationship frequency table, an operator frequent pattern knowledge table, and an operator relationship chain knowledge table. By leveraging Abstract Syntax Tree (AST) techniques and frequent itemset mining, we systematically extract operator knowledge from 185,236 real GEE scripts and syntax documentation, forming a structured knowledge base. Experimental results demonstrate that the framework achieves over 90% accuracy, recall, and F1 score in operator knowledge extraction. When integrated with the Retrieval-Augmented Generation (RAG) strategy for LLM-based geospatial code generation tasks, the knowledge base improves performance by 20-30%. Ablation studies further quantify the necessity of each knowledge table in the knowledge base construction. This work provides robust support for the advancement and application of geospatial code modeling techniques, offering an innovative approach to constructing domain-specific knowledge bases that enhance the code generation capabilities of LLMs, and fostering the deeper integration of generative AI technologies within the field of geoinformatics.
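The frequent itemset mining step can be illustrated with a minimal sketch restricted to operator pairs. It assumes each script has already been reduced to a list of operator names (the abstract says this is done with AST techniques, which are not reproduced here). The function name `frequent_operator_pairs`, the `min_count` threshold, and the sample scripts are illustrative assumptions, not the paper's pipeline or data.

```python
from collections import Counter
from itertools import combinations


def frequent_operator_pairs(scripts, min_count):
    """Return operator pairs that co-occur in at least min_count scripts.

    scripts: list of lists of operator names, one inner list per script.
    Each pair is counted at most once per script.
    """
    counts = Counter()
    for ops in scripts:
        unique = sorted(set(ops))  # sorted so each pair has one canonical order
        counts.update(combinations(unique, 2))
    return {pair: c for pair, c in counts.items() if c >= min_count}


# Illustrative input only; real input would come from parsed GEE scripts.
sample_scripts = [
    ["ee.ImageCollection", "filterDate", "median"],
    ["ee.ImageCollection", "filterDate"],
    ["filterDate", "median"],
]
pairs = frequent_operator_pairs(sample_scripts, min_count=2)
```

A pair table like this corresponds most closely to the operator relationship frequency table described above; longer frequent patterns would need a full itemset miner such as Apriori or FP-Growth.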
A Review of Intelligent Device Fault Diagnosis Technologies Based on Machine Vision
This paper provides a comprehensive review of mechanical equipment fault diagnosis methods, focusing on the advancements brought by Transformer-based models. It details the structure, working principles, and benefits of Transformers, particularly their self-attention mechanism and parallel computation capabilities, which have propelled their widespread application in natural language processing and computer vision. The discussion highlights key Transformer model variants, such as Vision Transformers (ViT) and their extensions, which leverage self-attention to improve accuracy and efficiency in visual tasks. Furthermore, the paper examines the application of Transformer-based approaches in intelligent fault diagnosis for mechanical systems, showcasing their superior ability to extract and recognize patterns from complex sensor data for precise fault identification. Despite these advancements, challenges remain, including the reliance on extensive labeled datasets, significant computational demands, and difficulties in deploying models on resource-limited devices. To address these limitations, the paper proposes future research directions, such as developing lightweight Transformer architectures, integrating multimodal data sources, and enhancing adaptability to diverse operational conditions. These efforts aim to further expand the application of Transformer-based methods in mechanical fault diagnosis, making them more robust, efficient, and suitable for real-world industrial environments.
Assisted morbidity coding: the SISCO.web use case for identifying the main diagnosis in Hospital Discharge Records
Cardillo, Elena, Frattura, Lucilla
The proper use of standard classifications, such as the International Classification of Diseases (ICD), and the coding of morbidity data have always been fundamental for all general epidemiological and many health-management purposes (WHO, 2016). One example is the use of the information flow of the Hospital Discharge Records (SDO) collected in national databases for monitoring hospitalization episodes provided in public and private hospitals and thus the provision of hospital assistance. This has become an indispensable tool for both administrative analyses (e.g., for accurate billing) and clinical elaborations (e.g., health quality assessment), which can lead to the planning of new measures to support healthcare and welfare activities or to more strictly clinical-epidemiological and outcome analyses. In this frame, although approaches to coding vary across institutions, clinical coding specialists frequently perform coding retrospectively. The assignment of codes to each patient episode of care during hospitalization is determined by different factors, among others the coder's interpretation of the available case notes and the completeness of the electronic health records. As a result, accurate coding depends on both the intelligibility of the case notes and the coders' knowledge of medical terminology (Sundararajan et al. 2015). Several studies have indicated poor reproducibility of clinical coding (Tatham A., 2008) and poor accuracy, which does not seem to depend on the version of the standard coding system used, which in the case of SDO is ICD (Quan et al. 2014). In recent years, although the application of artificial intelligence (AI) has begun to attract and, in some cases, assist clinicians in the practice of medical coding, the performance achieved by AI models does not meet expectations.
Ontology-Aware RAG for Improved Question-Answering in Cybersecurity Education
Zhao, Chengshuai, Agrawal, Garima, Kumarage, Tharindu, Tan, Zhen, Deng, Yuli, Chen, Ying-Chih, Liu, Huan
Integrating AI into education has the potential to transform the teaching of science and technology courses, particularly in the field of cybersecurity. AI-driven question-answering (QA) systems can actively manage uncertainty in cybersecurity problem-solving, offering interactive, inquiry-based learning experiences. Large language models (LLMs) have gained prominence in AI-driven QA systems, offering advanced language understanding and user engagement. However, they face challenges like hallucinations and limited domain-specific knowledge, which reduce their reliability in educational settings. To address these challenges, we propose CyberRAG, an ontology-aware retrieval-augmented generation (RAG) approach for developing a reliable and safe QA system in cybersecurity education. CyberRAG employs a two-step approach: first, it augments the domain-specific knowledge by retrieving validated cybersecurity documents from a knowledge base to enhance the relevance and accuracy of the response. Second, it mitigates hallucinations and misuse by integrating a knowledge graph ontology to validate the final answer. Experiments on publicly available cybersecurity datasets show that CyberRAG delivers accurate, reliable responses aligned with domain knowledge, demonstrating the potential of AI tools to enhance education.
Ontology-driven Prompt Tuning for LLM-based Task and Motion Planning
Din, Muhayy Ud, Rosell, Jan, Akram, Waseem, Zaplana, Isiah, Roa, Maximo A, Seneviratne, Lakmal, Hussain, Irfan
Performing complex manipulation tasks in dynamic environments requires efficient Task and Motion Planning (TAMP) approaches, which combine high-level symbolic planning with low-level motion planning. Advances in Large Language Models (LLMs), such as GPT-4, are transforming task planning by offering natural language as an intuitive and flexible way to describe tasks, generate symbolic plans, and reason. However, the effectiveness of LLM-based TAMP approaches is limited by static, template-based prompting, which struggles to adapt to dynamic environments and complex task contexts. To address these limitations, this work proposes a novel ontology-driven prompt-tuning framework that employs knowledge-based reasoning to refine and expand user prompts with task contextual reasoning and knowledge-based environment state descriptions. Integrating domain-specific knowledge into the prompt ensures semantically accurate and context-aware task plans. The proposed framework demonstrates its effectiveness by resolving semantic errors in symbolic plan generation, such as maintaining logical temporal goal ordering in scenarios involving hierarchical object placement. The proposed framework is validated through both simulation and real-world scenarios, demonstrating significant improvements over the baseline approach in terms of adaptability to dynamic environments and the generation of semantically correct task plans.
RUMC: A Rule-based Classifier Inspired by Evolutionary Methods
As the field of data analysis grows rapidly due to the large amounts of data being generated, effective data classification has become increasingly important. This paper introduces the RUle Mutation Classifier (RUMC), which represents a significant improvement over the Rule Aggregation ClassifiER (RACER). RUMC uses innovative rule mutation techniques based on evolutionary methods to improve classification accuracy. In tests with forty datasets from OpenML and the UCI Machine Learning Repository, RUMC consistently outperformed twenty other well-known classifiers, demonstrating its ability to uncover valuable insights from complex data.
The Rule Aggregating ClassifiER (RACER) [7] is a rule-based classification algorithm that generates initial rules from training dataset records with the same mechanism. However, these rules tend to be too specific, making them less effective for classifying new data, particularly when working with small datasets that have few distinct instances. To address this challenge, I introduce the RUle Mutation Classifier (RUMC), a novel algorithm that enhances the capabilities of RACER. RUMC aims to improve the handling of various datasets, including high-dimensional and low-sample-size