Goto

Collaborating Authors

 Expert Systems


Towards an Understanding of Context Utilization in Code Intelligence

arXiv.org Artificial Intelligence

Code intelligence is an emerging domain in software engineering, aiming to improve the effectiveness and efficiency of various code-related tasks. Recent research suggests that incorporating contextual information beyond the basic original task inputs (i.e., source code) can substantially enhance model performance. Such contextual signals may be obtained directly or indirectly from sources such as API documentation or intermediate representations like abstract syntax trees can significantly improve the effectiveness of code intelligence. Despite growing academic interest, there is a lack of systematic analysis of context in code intelligence. To address this gap, we conduct an extensive literature review of 146 relevant studies published between September 2007 and August 2024. Our investigation yields four main contributions. (1) A quantitative analysis of the research landscape, including publication trends, venues, and the explored domains; (2) A novel taxonomy of context types used in code intelligence; (3) A task-oriented analysis investigating context integration strategies across diverse code intelligence tasks; (4) A critical evaluation of evaluation methodologies for context-aware methods. Based on these findings, we identify fundamental challenges in context utilization in current code intelligence systems and propose a research roadmap that outlines key opportunities for future research.


Towards Responsible and Trustworthy Educational Data Mining: Comparing Symbolic, Sub-Symbolic, and Neural-Symbolic AI Methods

arXiv.org Artificial Intelligence

Given the demand for responsible and trustworthy AI for education, this study evaluates symbolic, sub-symbolic, and neural-symbolic AI (NSAI) in terms of generalizability and interpretability. Our extensive experiments on balanced and imbalanced self-regulated learning datasets of Estonian primary school students predicting 7th-grade mathematics national test performance showed that symbolic and sub-symbolic methods performed well on balanced data but struggled to identify low performers in imbalanced datasets. Interestingly, symbolic and sub-symbolic methods emphasized different factors in their decision-making: symbolic approaches primarily relied on cognitive and motivational factors, while sub-symbolic methods focused more on cognitive aspects, learnt knowledge, and the demographic variable of gender -- yet both largely overlooked metacognitive factors. The NSAI method, on the other hand, showed advantages by: (i) being more generalizable across both classes -- even in imbalanced datasets -- as its symbolic knowledge component compensated for the underrepresented class; and (ii) relying on a more integrated set of factors in its decision-making, including motivation, (meta)cognition, and learnt knowledge, thus offering a comprehensive and theoretically grounded interpretability framework. These contrasting findings highlight the need for a holistic comparison of AI methods before drawing conclusions based solely on predictive performance. They also underscore the potential of hybrid, human-centred NSAI methods to address the limitations of other AI families and move us closer to responsible AI for education. Specifically, by enabling stakeholders to contribute to AI design, NSAI aligns learned patterns with theoretical constructs, incorporates factors like motivation and metacognition, and strengthens the trustworthiness and responsibility of educational data mining.


Pro-life journalist assaulted on street assigns blame to Democratic rhetoric

FOX News

'Live Action' journalist Savannah Craven Antao speaks out after being punched by an interviewee on'The Will Cain Show.' Pro-life activist Savannah Craven Antao believes the Democratic Party's recent rhetoric about "punching" at their Republican opponents contributed to the attack that left her bloody during a recent interview. Antao, a young pro-life influencer who was punched in the face by a woman she was interviewing in New York City earlier this month, pointed to Rep. Jasmine Crockett's, D-Texas, recent line about Democrats "punching" as inspiring the attack that happened to her. "She said, 'I think that you punch,'" Antao told Fox News Digital. "'I think you're okay with punching.' So yeah – pretty much just describes the left at this point. They're totally fine with just using force like that to hurt people if they don't agree with them."


Data over dialogue: Why artificial intelligence is unlikely to humanise medicine

arXiv.org Artificial Intelligence

Recently, a growing number of experts in artificial intelligence (AI) and medicine have be-gun to suggest that the use of AI systems, particularly machine learning (ML) systems, is likely to humanise the practice of medicine by substantially improving the quality of clinician-patient relationships. In this thesis, however, I argue that medical ML systems are more likely to negatively impact these relationships than to improve them. In particular, I argue that the use of medical ML systems is likely to comprise the quality of trust, care, empathy, understanding, and communication between clinicians and patients.


AiGAS-dEVL-RC: An Adaptive Growing Neural Gas Model for Recurrently Drifting Unsupervised Data Streams

arXiv.org Artificial Intelligence

--Concept drift and extreme verification latency pose significant challenges in data stream learning, particularly when dealing with recurring concept changes in dynamic environments. This work introduces a novel method based on the Growing Neural Gas (GNG) algorithm, designed to effectively handle abrupt recurrent drifts while adapting to incrementally evolving data distributions (incremental drifts). Leveraging the self-organizing and topological adaptability of GNG, the proposed approach maintains a compact yet informative memory structure, allowing it to efficiently store and retrieve knowledge of past or recurring concepts, even under conditions of delayed or sparse stream supervision. Our experiments highlight the superiority of our approach over existing data stream learning methods designed to cope with incremental non-stationarities and verification latency, demonstrating its ability to quickly adapt to new drifts, robustly manage recurring patterns, and maintain high predictive accuracy with a minimal memory footprint. Unlike other techniques that fail to leverage recurring knowledge, our proposed approach is proven to be a robust and efficient online learning solution for unsupervised drifting data flows. Data stream learning has become increasingly relevant in a variety of real-world applications, ranging from fraud detection and stock market analysis to personalized recommendations and industrial process monitoring [1]. These systems rely on continuous real-time processing of data streams to make predictions or decisions. Unlike static datasets, data streams are often characterized by their unbounded, high-speed nature, which necessitates models that can operate incrementally, efficiently, and with minimal reliance on labeled data. Ensuring that such models remain accurate and adaptive over time is crucial for maintaining the performance of systems operating in dynamic environments [2]-[4].


Interactive Explanations for Reinforcement-Learning Agents

arXiv.org Artificial Intelligence

As reinforcement learning methods increasingly amass accomplishments, the need for comprehending their solutions becomes more crucial. Most explainable reinforcement learning (XRL) methods generate a static explanation depicting their developers' intuition of what should be explained and how. In contrast, literature from the social sciences proposes that meaningful explanations are structured as a dialog between the explainer and the explainee, suggesting a more active role for the user and her communication with the agent. In this paper, we present ASQ-IT -- an interactive explanation system that presents video clips of the agent acting in its environment based on queries given by the user that describe temporal properties of behaviors of interest. Our approach is based on formal methods: queries in ASQ-IT's user interface map to a fragment of Linear Temporal Logic over finite traces (LTLf), which we developed, and our algorithm for query processing is based on automata theory. User studies show that end-users can understand and formulate queries in ASQ-IT and that using ASQ-IT assists users in identifying faulty agent behaviors.


Knowledge-Base based Semantic Image Transmission Using CLIP

arXiv.org Artificial Intelligence

This paper proposes a novel knowledge-Base (KB) assisted semantic communication framework for image transmission. At the receiver, a Facebook AI Similarity Search (FAISS) based vector database is constructed by extracting semantic embeddings from images using the Contrastive Language-Image Pre-Training (CLIP) model. During transmission, the transmitter first extracts a 512-dimensional semantic feature using the CLIP model, then compresses it with a lightweight neural network for transmission. After receiving the signal, the receiver reconstructs the feature back to 512 dimensions and performs similarity matching from the KB to retrieve the most semantically similar image. Semantic transmission success is determined by category consistency between the transmitted and retrieved images, rather than traditional metrics like Peak Signal-to-Noise Ratio (PSNR). The proposed system prioritizes semantic accuracy, offering a new evaluation paradigm for semantic-aware communication systems. Experimental validation on CIFAR100 demonstrates the effectiveness of the framework in achieving semantic image transmission.


Which LIME should I trust? Concepts, Challenges, and Solutions

arXiv.org Artificial Intelligence

As neural networks become dominant in essential systems, Explainable Artificial Intelligence (XAI) plays a crucial role in fostering trust and detecting potential misbehavior of opaque models. LIME (Local Interpretable Model-agnostic Explanations) is among the most prominent model-agnostic approaches, generating explanations by approximating the behavior of black-box models around specific instances. Despite its popularity, LIME faces challenges related to fidelity, stability, and applicability to domain-specific problems. Numerous adaptations and enhancements have been proposed to address these issues, but the growing number of developments can be overwhelming, complicating efforts to navigate LIME-related research. To the best of our knowledge, this is the first survey to comprehensively explore and collect LIME's foundational concepts and known limitations. We categorize and compare its various enhancements, offering a structured taxonomy based on intermediate steps and key issues. Our analysis provides a holistic overview of advancements in LIME, guiding future research and helping practitioners identify suitable approaches. Additionally, we provide a continuously updated interactive website (https://patrick-knab.github.io/which-lime-to-trust/), offering a concise and accessible overview of the survey.


An End-to-End Comprehensive Gear Fault Diagnosis Method Based on Multi-Scale Feature-Level Fusion Strategy

arXiv.org Artificial Intelligence

To satisfy the requirements of the end-to-end fault diagnosis of gears, an integrated intelligent method of fault diagnosis for gears using acceleration signals was proposed, which was based on Gabor-based Adaptive Short-Time Fourier Transform (Gabor-ASTFT) and Dual-Tree Complex Wavelet Transform(DTCWT) algorithms, Dilated Residual structure and feature fusion layer, is proposed in this paper. Initially, the raw one-dimensional acceleration signals collected from the gearbox base using vibration sensors undergo pre-segmentation processing. The Gabor-ASTFT and DTCWT are then applied to convert the original one-dimensional time-domain signals into two-dimensional time-frequency representations, facilitating the preliminary extraction of fault features and obtaining weak feature maps.Subsequently, a dual-channel structure is established using deconvolution and dilated convolution to perform upsampling and downsampling on the feature maps, adjusting their sizes accordingly. A feature fusion layer is then constructed to integrate the dual-channel features, enabling multi-scale analysis of the extracted fault features.Finally, a convolutional neural network (CNN) model incorporating a residual structure is developed to conduct deep feature extraction from the fused feature maps. The extracted features are subsequently fed into a Global Average Pooling(GAP) and a classification function for fault classification. Conducting comparative experiments on different datasets, the proposed method is demonstrated to effectively meet the requirements of end-to-end fault diagnosis for gears.


RARE: Retrieval-Augmented Reasoning Modeling

arXiv.org Artificial Intelligence

Domain-specific intelligence demands specialized knowledge and sophisticated reasoning for problem-solving, posing significant challenges for large language models (LLMs) that struggle with knowledge hallucination and inadequate reasoning capabilities under constrained parameter budgets. Inspired by Bloom's Taxonomy in educational theory, we propose Retrieval-Augmented Reasoning Modeling (RARE), a novel paradigm that decouples knowledge storage from reasoning optimization. RARE externalizes domain knowledge to retrievable sources and internalizes domain-specific reasoning patterns during training. Specifically, by injecting retrieved knowledge into training prompts, RARE transforms learning objectives from rote memorization to contextualized reasoning application. It enables models to bypass parameter-intensive memorization and prioritize the development of higher-order cognitive processes. Our experiments demonstrate that lightweight RARE-trained models (e.g., Llama-3.1-8B) could achieve state-of-the-art performance, surpassing retrieval-augmented GPT-4 and Deepseek-R1 distilled counterparts. RARE establishes a paradigm shift where maintainable external knowledge bases synergize with compact, reasoning-optimized models, collectively driving more scalable domain-specific intelligence. Repo: https://github.com/Open-DataFlow/RARE