Goto

Collaborating Authors

 Expert Systems


Merging Language and Domain Specific Models: The Impact on Technical Vocabulary Acquisition

arXiv.org Artificial Intelligence

This paper investigates the integration of technical vocabulary in merged language models. We explore the knowledge transfer mechanisms involved when combining a general-purpose language-specific model with a domain-specific model, focusing on the resulting model's comprehension of technical jargon. Our experiments analyze the impact of this merging process on the target model's proficiency in handling specialized terminology. We present a quantitative evaluation of the performance of the merged model, comparing it with that of the individual constituent models. The findings offer insights into the effectiveness of different model merging methods for enhancing domain-specific knowledge and highlight potential challenges and future directions in leveraging these methods for cross-lingual knowledge transfer in Natural Language Processing.


Accelerating Anchors via Specialization and Feature Transformation

arXiv.org Artificial Intelligence

Anchors is a popular local model-agnostic explanation technique whose applicability is limited by its computational inefficiency. To address this limitation, we propose a pre-training-based approach to accelerate Anchors without compromising the explanation quality. Our approach leverages the iterative nature of Anchors' algorithm which gradually refines an explanation until it is precise enough for a given input by providing a general explanation that is obtained through pre-training as Anchors' initial explanation. Specifically, we develop a two-step rule transformation process: the horizontal transformation adapts a pre-trained explanation to the current input by replacing features, and the vertical transformation refines the general explanation until it is precise enough for the input. We evaluate our method across tabular, text, and image datasets, demonstrating that it significantly reduces explanation generation time while maintaining fidelity and interpretability, thereby enabling the practical adoption of Anchors in time-sensitive applications.


Recent Advances in Malware Detection: Graph Learning and Explainability

arXiv.org Artificial Intelligence

The rapid evolution of malware has necessitated the development of sophisticated detection methods that go beyond traditional signature-based approaches. Graph learning techniques have emerged as powerful tools for modeling and analyzing the complex relationships inherent in malware behavior, leveraging advancements in Graph Neural Networks (GNNs) and related methods. This survey provides a comprehensive exploration of recent advances in malware detection, focusing on the interplay between graph learning and explainability. It begins by reviewing malware analysis techniques and datasets, emphasizing their foundational role in understanding malware behavior and supporting detection strategies. The survey then discusses feature engineering, graph reduction, and graph embedding methods, highlighting their significance in transforming raw data into actionable insights, while ensuring scalability and efficiency. Furthermore, this survey focuses on explainability techniques and their applications in malware detection, ensuring transparency and trustworthiness. By integrating these components, this survey demonstrates how graph learning and explainability contribute to building robust, interpretable, and scalable malware detection systems. Future research directions are outlined to address existing challenges and unlock new opportunities in this critical area of cybersecurity.


Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking

arXiv.org Artificial Intelligence

The pervasiveness of large language models and generative AI in online media has amplified the need for effective automated fact-checking to assist fact-checkers in tackling the increasing volume and sophistication of misinformation. The complex nature of fact-checking demands that automated fact-checking systems provide explanations that enable fact-checkers to scrutinise their outputs. However, it is unclear how these explanations should align with the decision-making and reasoning processes of fact-checkers to be effectively integrated into their workflows. Through semi-structured interviews with fact-checking professionals, we bridge this gap by: (i) providing an account of how fact-checkers assess evidence, make decisions, and explain their processes; (ii) examining how fact-checkers use automated tools in practice; and (iii) identifying fact-checker explanation requirements for automated fact-checking tools. The findings show unmet explanation needs and identify important criteria for replicable fact-checking explanations that trace the model's reasoning path, reference specific evidence, and highlight uncertainty and information gaps.


Knowledge Integration Strategies in Autonomous Vehicle Prediction and Planning: A Comprehensive Survey

arXiv.org Artificial Intelligence

This comprehensive survey examines the integration of knowledge-based approaches into autonomous driving systems, with a focus on trajectory prediction and planning. We systematically review methodologies for incorporating domain knowledge, traffic rules, and commonsense reasoning into these systems, spanning purely symbolic representations to hybrid neuro-symbolic architectures. In particular, we analyze recent advancements in formal logic and differential logic programming, reinforcement learning frameworks, and emerging techniques that leverage large foundation models and diffusion models for knowledge representation. Organized under a unified literature survey section, our discussion synthesizes the state-of-the-art into a high-level overview, supported by a detailed comparative table that maps key works to their respective methodological categories. This survey not only highlights current trends -- including the growing emphasis on interpretable AI, formal verification in safety-critical systems, and the increased use of generative models in prediction and planning -- but also outlines the challenges and opportunities for developing robust, knowledge-enhanced autonomous driving systems.


Advancing machine fault diagnosis: A detailed examination of convolutional neural networks

arXiv.org Artificial Intelligence

The growing complexity of machinery and the increasing demand for operational efficiency and safety have driven the development of advanced fault diagnosis techniques. Among these, convolutional neural networks (CNNs) have emerged as a powerful tool, offering robust and accurate fault detection and classification capabilities. This comprehensive review delves into the application of CNNs in machine fault diagnosis, covering its theoretical foundation, architectural variations, and practical implementations. The strengths and limitations of CNNs are analyzed in this domain, discussing their effectiveness in handling various fault types, data complexities, and operational environments. Furthermore, we explore the evolving landscape of CNN-based fault diagnosis, examining recent advancements in data augmentation, transfer learning, and hybrid architectures. Finally, we highlight future research directions and potential challenges to further enhance the application of CNNs for reliable and proactive machine fault diagnosis.


Model-Free Counterfactual Subset Selection at Scale

arXiv.org Artificial Intelligence

Ensuring transparency in AI decision-making requires interpretable explanations, particularly at the instance level. Counterfactual explanations are a powerful tool for this purpose, but existing techniques frequently depend on synthetic examples, introducing biases from unrealistic assumptions, flawed models, or skewed data. Many methods also assume full dataset availability, an impractical constraint in real-time environments where data flows continuously. In contrast, streaming explanations offer adaptive, real-time insights without requiring persistent storage of the entire dataset. This work introduces a scalable, model-free approach to selecting diverse and relevant counterfactual examples directly from observed data. Our algorithm operates efficiently in streaming settings, maintaining $O(\log k)$ update complexity per item while ensuring high-quality counterfactual selection. Empirical evaluations on both real-world and synthetic datasets demonstrate superior performance over baseline methods, with robust behavior even under adversarial conditions.


Z-REx: Human-Interpretable GNN Explanations for Real Estate Recommendations

arXiv.org Artificial Intelligence

Transparency and interpretability are crucial for enhancing customer confidence and user engagement, especially when dealing with black-box Machine Learning (ML)-based recommendation systems. Modern recommendation systems leverage Graph Neural Network (GNN) due to their ability to produce high-quality recommendations in terms of both relevance and diversity. Therefore, the explainability of GNN is especially important for Link Prediction (LP) tasks since recommending relevant items can be viewed as predicting links between users and items. GNN explainability has been a well-studied field, existing methods primarily focus on node or graph-level tasks, leaving a gap in LP explanation techniques. This work introduces Z-REx, a GNN explanation framework designed explicitly for heterogeneous link prediction tasks. Z-REx utilizes structural and attribute perturbation to identify critical sub-structures and important features while reducing the search space by leveraging domain-specific knowledge. In our experimentation, we show the efficacy of Z-REx in generating contextually relevant and human-interpretable explanations for ZiGNN, a GNN-based recommendation engine, using a real-world real-estate dataset from Zillow Group, Inc. We also compare Z-REx to State-of-The-Art (SOTA) GNN explainers to show Z-REx's superiority in producing high-quality human-interpretable explanations.


Integrating Artificial Intelligence and Geophysical Insights for Earthquake Forecasting: A Cross-Disciplinary Review

arXiv.org Artificial Intelligence

Earthquake forecasting remains a significant scientific challenge, with current methods falling short of achieving the performance necessary for meaningful societal benefits. Traditional models, primarily based on past seismicity and geomechanical data, struggle to capture the complexity of seismic patterns and often overlook valuable non-seismic precursors such as geophysical, geochemical, and atmospheric anomalies. The integration of such diverse data sources into forecasting models, combined with advancements in AI technologies, offers a promising path forward. AI methods, particularly deep learning, excel at processing complex, large-scale datasets, identifying subtle patterns, and handling multidimensional relationships, making them well-suited for overcoming the limitations of conventional approaches. This review highlights the importance of combining AI with geophysical knowledge to create robust, physics-informed forecasting models. It explores current AI methods, input data types, loss functions, and practical considerations for model development, offering guidance to both geophysicists and AI researchers. While many AI-based studies oversimplify earthquake prediction, neglecting critical features such as data imbalance and spatio-temporal clustering, the integration of specialized geophysical insights into AI models can address these shortcomings. We emphasize the importance of interdisciplinary collaboration, urging geophysicists to experiment with AI architectures thoughtfully and encouraging AI experts to deepen their understanding of seismology. By bridging these disciplines, we can develop more accurate, reliable, and societally impactful earthquake forecasting tools.


Combining Large Language Models with Static Analyzers for Code Review Generation

arXiv.org Artificial Intelligence

Code review is a crucial but often complex, subjective, and time-consuming activity in software development. Over the past decades, significant efforts have been made to automate this process. Early approaches focused on knowledge-based systems (KBS) that apply rule-based mechanisms to detect code issues, providing precise feedback but struggling with complex, context-dependent cases. More recent work has shifted toward fine-tuning pre-trained language models for code review, enabling broader issue coverage but often at the expense of precision. In this paper, we propose a hybrid approach that combines the strengths of KBS and learning-based systems (LBS) to generate high-quality, comprehensive code reviews. Our method integrates knowledge at three distinct stages of the language model pipeline: during data preparation (Data-Augmented Training, DAT), at inference (Retrieval-Augmented Generation, RAG), and after inference (Naive Concatenation of Outputs, NCO). We empirically evaluate our combination strategies against standalone KBS and LBS fine-tuned on a real-world dataset. Our results show that these hybrid strategies enhance the relevance, completeness, and overall quality of review comments, effectively bridging the gap between rule-based tools and deep learning models.