Unlearning Imperative: Securing Trustworthy and Responsible LLMs through Engineered Forgetting

arXiv.org Artificial Intelligence

The growing use of large language models in sensitive domains has exposed a critical weakness: the inability to ensure that private information can be permanently forgotten. Yet these systems still lack reliable mechanisms to guarantee that sensitive information can be permanently removed once it has been used. Retraining from the beginning is prohibitively costly, and existing unlearning methods remain fragmented, difficult to verify, and often vulnerable to recovery. This paper surveys recent research on machine unlearning for LLMs and considers how far current approaches can address these challenges. We review methods for evaluating whether forgetting has occurred, the resilience of unlearned models against adversarial attacks, and mechanisms that can support user trust when model complexity or proprietary limits restrict transparency. Technical solutions such as differential privacy, homomorphic encryption, federated learning, and ephemeral memory are examined alongside institutional safeguards including auditing practices and regulatory frameworks. The review finds steady progress, but robust and verifiable unlearning is still unresolved. Efficient techniques that avoid costly retraining, stronger defenses against adversarial recovery, and governance structures that reinforce accountability are needed if LLMs are to be deployed safely in sensitive applications. By integrating technical and organizational perspectives, this study outlines a pathway toward AI systems that can be required to forget, while maintaining both privacy and public trust.


From Street to Orbit: Training-Free Cross-View Retrieval via Location Semantics and LLM Guidance

arXiv.org Artificial Intelligence

Cross-view image retrieval, particularly street-to-satellite matching, is a critical task for applications such as autonomous navigation, urban planning, and localization in GPS-denied environments. However, existing approaches often require supervised training on curated datasets and rely on panoramic or UAV-based images, which limits real-world deployment. In this paper, we present a simple yet effective cross-view image retrieval framework that leverages a pretrained vision encoder and a large language model (LLM), requiring no additional training. Given a monocular street-view image, our method extracts geographic cues through web-based image search and LLM-based location inference, generates a satellite query via a geocoding API, and retrieves matching tiles using a pretrained vision encoder (e.g., DINOv2) with PCA-based whitening feature refinement. Despite using no ground-truth supervision or finetuning, our proposed method outperforms prior learning-based approaches on the benchmark dataset under zero-shot settings. Moreover, our pipeline enables automatic construction of semantically aligned street-to-satellite datasets, offering a scalable and cost-efficient alternative to manual annotation. All source code will be made publicly available at https://jeonghomin.github.io/


Predicate-Argument Structure Divergences in Chinese and English Parallel Sentences and their Impact on Language Transfer

arXiv.org Artificial Intelligence

Cross-lingual Natural Language Processing (NLP) has gained significant traction in recent years, offering practical solutions in low-resource settings by transferring linguistic knowledge from resource-rich to low-resource languages. This field leverages techniques like annotation projection and model transfer for language adaptation, supported by multilingual pre-trained language models. However, linguistic divergences hinder language transfer, especially among typologically distant languages. In this paper, we present an analysis of predicate-argument structures in parallel Chinese and English sentences. We explore the alignment and misalignment of predicate annotations, inspecting similarities and differences and proposing a categorization of structural divergences. The analysis and the categorization are supported by a qualitative and quantitative analysis of the results of an annotation projection experiment in which each of the two languages is used in turn as the source language to project annotations onto the corresponding parallel sentences. The results of this analysis show clearly that language transfer is asymmetric, an aspect that requires attention when selecting the source language in transfer learning applications and that needs to be investigated before any scientific claim about cross-lingual NLP is made.


A Robust Task-Level Control Architecture for Learned Dynamical Systems

arXiv.org Artificial Intelligence

Dynamical system (DS)-based learning from demonstration (LfD) is a powerful tool for generating motion plans in the operation ("task") space of robotic systems. However, the realization of the generated motion plans is often compromised by a "task-execution mismatch", where unmodeled dynamics, persistent disturbances, and system latency cause the robot's actual task-space state to diverge from the desired motion trajectory. We propose a novel task-level robust control architecture, L1-augmented Dynamical Systems (L1-DS), that explicitly handles the task-execution mismatch in tracking a nominal motion plan generated by any DS-based LfD scheme. Our framework augments any DS-based LfD model with a nominal stabilizing controller and an L1 adaptive controller. Furthermore, we introduce a windowed Dynamic Time Warping (DTW)-based target selector, which enables the nominal stabilizing controller to handle temporal misalignment for improved phase-consistent tracking. We demonstrate the efficacy of our architecture on the LASA and IROS handwriting datasets.
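
The abstract does not describe how the target selector is built, so the following is only background on the windowed DTW building block it names. It is a minimal sketch assuming 1-D sequences, an absolute-difference local cost, and a Sakoe-Chiba band as the "window"; the function name `windowed_dtw` and these choices are illustrative assumptions, not the authors' implementation.

```python
import math


def windowed_dtw(a, b, window):
    """DTW distance between two 1-D sequences, restricted to a band.

    Assumption: the window is a Sakoe-Chiba band, i.e. element i of `a`
    may only be matched with elements j of `b` where |i - j| <= window.
    """
    n, m = len(a), len(b)
    # The band must be at least |n - m| wide, or the end cell is unreachable.
    w = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

A small window keeps the alignment close to the diagonal, which limits how far apart in time two matched samples can be and reduces the cost from O(nm) to roughly O(n · window).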


Privacy-Preserving Explainable AIoT Application via SHAP Entropy Regularization

arXiv.org Artificial Intelligence

The widespread integration of Artificial Intelligence of Things (AIoT) in smart home environments has amplified the demand for transparent and interpretable machine learning models. To foster user trust and comply with emerging regulatory frameworks, Explainable AI (XAI) methods, particularly post-hoc techniques such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), are widely employed to elucidate model behavior. However, recent studies have shown that these explanation methods can inadvertently expose sensitive user attributes and behavioral patterns, thereby introducing new privacy risks. To address these concerns, we propose a novel privacy-preserving approach based on SHAP entropy regularization to mitigate privacy leakage in explainable AIoT applications. Our method incorporates an entropy-based regularization objective that penalizes low-entropy SHAP attribution distributions during training, promoting a more uniform spread of feature contributions. To evaluate the effectiveness of our approach, we developed a suite of SHAP-based privacy attacks that strategically leverage model explanation outputs to infer sensitive information. We validate our method through comparative evaluations using these attacks alongside utility metrics on benchmark smart home energy consumption datasets. Experimental results demonstrate that SHAP entropy regularization substantially reduces privacy leakage compared to baseline models, while maintaining high predictive accuracy and explanation fidelity. This work contributes to the development of privacy-preserving explainable AI techniques for secure and trustworthy AIoT applications.
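
To make the idea of penalizing low-entropy attribution distributions concrete, here is a minimal sketch. It assumes the attributions for one sample are already available as a list of numbers and that they are turned into a distribution by normalizing absolute values; the function names, the `log(n) - H` form of the penalty, and the small `eps` constant are illustrative assumptions rather than the paper's formulation.

```python
import math


def attribution_entropy(attributions, eps=1e-12):
    """Shannon entropy (nats) of the normalized absolute attributions."""
    magnitudes = [abs(a) for a in attributions]
    total = sum(magnitudes) + eps  # eps avoids division by zero
    probs = [m / total for m in magnitudes]
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_penalty(attributions):
    """Gap between the maximum possible entropy, log(n), and the actual one.

    Close to zero when contributions are spread uniformly over features,
    and largest when a single feature carries all of the attribution.
    """
    return math.log(len(attributions)) - attribution_entropy(attributions)


def regularized_loss(task_loss, attributions, lam=0.1):
    """Task loss plus a weighted entropy penalty (lam is a hypothetical weight)."""
    return task_loss + lam * entropy_penalty(attributions)
```

In actual training the attributions would have to come from a differentiable SHAP estimate so that the penalty can influence the model's weights; this sketch only shows how the penalty value behaves.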


Assessing the Applicability of Natural Language Processing to Traditional Social Science Methodology: A Case Study in Identifying Strategic Signaling Patterns in Presidential Directives

arXiv.org Artificial Intelligence

Our research investigates how Natural Language Processing (NLP) can be used to extract main topics from a larger corpus of written data, as applied to the case of identifying signaling themes in Presidential Directives (PDs) from the Reagan through Clinton administrations. Analysts and NLP both identified relevant documents, demonstrating the potential utility of NLP in research involving large written corpora. However, we also identified discrepancies between NLP and human-labeled results that indicate a need for more research to assess the validity of NLP in this use case. The research was conducted in 2023, and the rapidly evolving landscape of AI/ML means existing tools have improved and new tools have been developed; this research displays the inherent capabilities of a potentially dated AI tool in emerging social science applications.


A Fourier-Based Global Denoising Model for Smart Artifacts Removing of Microscopy Images

arXiv.org Artificial Intelligence

Microscopy techniques such as Scanning Tunneling Microscopy (STM), Atomic Force Microscopy (AFM), and Scanning Electron Microscopy (SEM) are essential tools in material imaging at micro- and nanoscale resolutions to extract physical knowledge and materials structure-property relationships. However, tuning microscopy controls (e.g., scanning speed, current setpoint, tip bias) to obtain high-quality images is a non-trivial and time-consuming effort. On the other hand, with sub-standard images, the key features are not accurately discovered due to noise and artifacts, leading to erroneous analysis. Existing denoising models mostly treat weak signals as noise while enhancing strong signals as key features, which is not always appropriate for microscopy images and can erase a significant amount of hidden physical information. To address these limitations, we propose a global denoising model (GDM) to smartly remove artifacts from microscopy images while preserving weaker but physically important features. The proposed model is developed by 1) first designing a two-channel imaging input of non-paired, goal-specific pre-processed images with a user-defined trade-off between the two channels and 2) then integrating a pixel-based and fast Fourier transform (FFT)-based loss function for training the U-net model. We compared the proposed GDM with the non-FFT denoising model on STM-generated images of copper (Cu) and silicon (Si) materials, AFM-generated Pantoea sp. YR343 bio-film images, and SEM-generated plastic degradation images. We believe this proposed workflow can be extended to improve other microscopy image quality and will benefit experimentalists through the design flexibility to smartly tune the model according to domain-expert preferences.
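
As a rough illustration of a loss that combines a pixel term with a Fourier-domain term, the sketch below uses a naive 2-D discrete Fourier transform on small nested lists. The weighting `alpha`, the mean-squared pixel term, the mean-absolute spectral term, and all function names are assumptions made for illustration; the abstract does not give the exact form used in the paper, and a real implementation would use an FFT routine from a tensor library.

```python
import cmath


def dft2(img):
    """Naive 2-D DFT of a small image given as a list of rows (O(N^2) per output)."""
    rows, cols = len(img), len(img[0])
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            acc = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * cmath.pi * (u * x / rows + v * y / cols)
                    acc += img[x][y] * cmath.exp(1j * angle)
            out_row.append(acc)
        out.append(out_row)
    return out


def combined_loss(pred, target, alpha=0.5):
    """alpha * pixel-space MSE + (1 - alpha) * mean absolute spectral difference."""
    n = len(pred) * len(pred[0])
    pixel = sum(
        (p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)
    ) / n
    spec_pred, spec_target = dft2(pred), dft2(target)
    freq = sum(
        abs(a - b)
        for ar, br in zip(spec_pred, spec_target)
        for a, b in zip(ar, br)
    ) / n
    return alpha * pixel + (1 - alpha) * freq
```

The spectral term here uses an absolute difference rather than a squared one on purpose: by Parseval's theorem, a squared-error term on the full spectrum is proportional to the pixel-space squared error and would add no new information.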


An explainable Recursive Feature Elimination to detect Advanced Persistent Threats using Random Forest classifier

arXiv.org Artificial Intelligence

V. CONCLUSION This study developed an interpretable Intrusion Detection System (IDS) capable of detecting Advanced Persistent Threats (APTs) with high accuracy. By integrating Recursive Feature Elimination (RFE) and Random Forest (RF), the framework efficiently reduced dimensionality and improved detection performance. SHapley Additive exPlanations (SHAP) was integrated to provide both global and instance-level interpretability, enabling transparent insight into the model's decision-making process. Experimental evaluation demonstrated that the system achieved a detection accuracy of 99.9% and exhibited robust performance. Future work will evaluate the proposed RF-RFE framework in real-time deployment environments, where rapid response is crucial. Deep learning and ensemble-based models, such as Long Short-Term Memory (LSTM) networks, can be explored to better capture temporal patterns in evolving cyber threats. These enhancements can improve the system's effectiveness and operational relevance in real-world intrusion detection scenarios. The framework will also be benchmarked against advanced classifiers, including LSTM, XGBoost, and generative AI-based techniques, to compare performance in terms of accuracy, interpretability, and adaptability.


Enhancing Password Security Through a High-Accuracy Scoring Framework Using Random Forests

arXiv.org Artificial Intelligence

Password security plays a crucial role in cybersecurity, yet traditional password strength meters, which rely on static rules like character-type requirements, often fail. Such methods are easily bypassed by common password patterns (e.g., 'P@ssw0rd1!'), giving users a false sense of security. To address this, we implement and evaluate a password strength scoring system by comparing four machine learning models, Random Forest (RF), Support Vector Machine (SVM), a Convolutional Neural Network (CNN), and Logistic Regression, on a dataset of over 660,000 real-world passwords. Our primary contribution is a novel hybrid feature engineering approach that captures nuanced vulnerabilities missed by standard metrics. We introduce features like leetspeak-normalized Shannon entropy to assess true randomness, pattern detection for keyboard walks and sequences, and character-level TF-IDF n-grams to identify frequently reused substrings from breached password datasets. Crucially, the interpretability of the Random Forest model allows for feature importance analysis, providing a clear pathway to developing security tools that offer specific, actionable feedback to users. This study bridges the gap between predictive accuracy and practical usability, resulting in a high-performance scoring system that not only reduces password-based vulnerabilities but also empowers users to make more informed security decisions. Keywords: Password Security, Machine Learning, Rule-Based Attack, Brute-Force Attack, Dictionary Attack, Cybersecurity. 1. Passwords remain a cornerstone of online security, serving as the primary means of authentication for countless systems and applications. However, this reliance is a critical vulnerability; according to a report by Google Cloud, a staggering 86% of breaches involve stolen credentials, posing a significant threat to both user data and system security [1]. Many users choose weak, easily guessable passwords. Attackers frequently exploit this vulnerability in large-scale attacks, compromising user privacy and enabling financial fraud. Most traditional password strength scoring tools rely on static rules, such as requiring a mix of lowercase, uppercase, digits, and special characters (LUDS), which fail to adapt to evolving attack patterns.
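
One of the features named above, leetspeak-normalized Shannon entropy, can be sketched as follows. The substitution table `LEET_MAP` and the choice to lowercase before measuring entropy are assumptions made for illustration; the paper's actual mapping and normalization steps are not given in the abstract.

```python
import math
from collections import Counter

# Hypothetical leetspeak substitutions; a real table would be larger.
LEET_MAP = str.maketrans(
    {"@": "a", "4": "a", "0": "o", "1": "l", "3": "e",
     "$": "s", "5": "s", "7": "t", "!": "i"}
)


def shannon_entropy(text):
    """Per-character Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    n = len(text)
    counts = Counter(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def leet_normalized_entropy(password):
    """Entropy after lowercasing and undoing common leetspeak substitutions."""
    normalized = password.lower().translate(LEET_MAP)
    return shannon_entropy(normalized)
```

The point of normalizing first is that substitutions such as `@` for `a` add distinct symbols without adding real unpredictability. For example, the string `a@4A` has four distinct characters, but all of them collapse to `a` after normalization, so the normalized entropy is zero.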


SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models

arXiv.org Artificial Intelligence

Refusal refers to the functional behavior enabling safety-aligned language models to reject harmful or unethical prompts. Following the growing scientific interest in mechanistic interpretability, recent work encoded refusal behavior as a single direction in the model's latent space; e.g., computed as the difference between the centroids of harmful and harmless prompt representations. However, emerging evidence suggests that concepts in LLMs often appear to be encoded as a low-dimensional manifold embedded in the high-dimensional latent space. Motivated by these findings, we propose a novel method leveraging Self-Organizing Maps (SOMs) to extract multiple refusal directions. To this end, we first prove that SOMs generalize the prior work's difference-in-means technique. We then train SOMs on harmful prompt representations to identify multiple neurons. By subtracting the centroid of harmless representations from each neuron, we derive a set of multiple directions expressing the refusal concept. We validate our method on an extensive experimental setup, demonstrating that ablating multiple directions from models' internals outperforms not only the single-direction baseline but also specialized jailbreak algorithms, leading to an effective suppression of refusal. Finally, we conclude by analyzing the mechanistic implications of our approach.
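
The single-direction baseline and the ablation step described above can be sketched in a few lines. This assumes representations are plain lists of floats and that the SOM neurons have already been obtained elsewhere (SOM training is not shown); the function names are hypothetical, and the sequential projection used in `ablate` is a simplification that removes the whole spanned subspace only when the directions are orthogonal.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("zero-length direction")
    return [x / norm for x in v]


def difference_in_means(harmful, harmless):
    """Single refusal direction: harmful centroid minus harmless centroid."""
    mu_harmful, mu_harmless = centroid(harmful), centroid(harmless)
    return unit([a - b for a, b in zip(mu_harmful, mu_harmless)])


def directions_from_neurons(neurons, harmless):
    """One direction per (pre-trained) SOM neuron, relative to the harmless centroid."""
    mu_harmless = centroid(harmless)
    return [unit([a - b for a, b in zip(w, mu_harmless)]) for w in neurons]


def ablate(x, directions):
    """Project out each unit direction from x in turn."""
    out = list(x)
    for r in directions:
        dot = sum(a * b for a, b in zip(out, r))
        out = [a - dot * b for a, b in zip(out, r)]
    return out
```

With a single "neuron" equal to the harmful centroid, `directions_from_neurons` returns the same direction as `difference_in_means`, which is the simplest case of the generalization the abstract mentions.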