Goto

Collaborating Authors

 Government


Sentinel: Dynamic Knowledge Distillation for Personalized Federated Intrusion Detection in Heterogeneous IoT Networks

arXiv.org Artificial Intelligence

Abstract--Federated learning (FL) offers a privacy-preserving paradigm for machine learning, but its application in intrusion detection systems (IDS) within IoT networks is challenged by severe class imbalance, non-IID data, and high communication overhead.These challenges severely degrade the performance of conventional FL methods in real-world network traffic classification. T o overcome these limitations, we propose Sentinel, a personalized federated IDS (pFed-IDS) framework that incorporates a dual-model architecture on each client, consisting of a personalized teacher and a lightweight shared student model. This design effectively balances deep local adaptation with efficient global model consensus while preserving client privacy by transmitting only the compact student model, thus reducing communication costs. Sentinel integrates three key mechanisms to ensure robust performance: bidirectional knowledge distillation with adaptive temperature scaling, multi-faceted feature alignment, and class-balanced loss functions. Furthermore, the server employs normalized gradient aggregation with equal client weighting to enhance fairness and mitigate client drift. Extensive experiments on the IoTID20 and 5GNIDD benchmark datasets demonstrate that Sentinel significantly outperforms state-of-the-art federated methods, establishing a new performance benchmark, especially under extreme data heterogeneity, while maintaining communication efficiency. HE rapid proliferation of billions of heterogeneous Internet of Things (IoT) devices has significantly expanded attack surfaces, presenting new challenges for network security. Insufficient security measures in many of these devices--such as inadequate authentication, weak encryption, and vulnerable communication protocols--facilitate a continuous influx of various cyberattacks, including novel (zero-day) threats, which pose significant risks to the availability, confidentiality, and integrity of data and systems. Traditional firewalls and encryption of the security system are not enough to prevent increasing cyber attacks [1].


Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction

arXiv.org Artificial Intelligence

Recently, semantically constrained adversarial examples (SemanticAE), which are directly generated from natural language instructions, have become a promising avenue for future research due to their flexible attacking forms. To generate SemanticAEs, current methods fall short of satisfactory attacking ability as the key underlying factors of semantic uncertainty in human instructions, such as referring diversity, descriptive incompleteness, and boundary ambiguity, have not been fully investigated. To tackle the issues, this paper develops a multi-dimensional instruction uncertainty reduction (InSUR) framework to generate more satisfactory SemanticAE, i.e., transferable, adaptive, and effective. Specifically, in the dimension of the sampling method, we propose the residual-driven attacking direction stabilization to alleviate the unstable adversarial optimization caused by the diversity of language references. By coarsely predicting the language-guided sampling process, the optimization process will be stabilized by the designed ResAdv-DDIM sampler, therefore releasing the transferable and robust adversarial capability of multi-step diffusion models. In task modeling, we propose the context-encoded attacking scenario constraint to supplement the missing knowledge from incomplete human instructions. Guidance masking and renderer integration are proposed to regulate the constraints of 2D/3D SemanticAE, activating stronger scenario-adapted attacks. Moreover, in the dimension of generator evaluation, we propose the semantic-abstracted attacking evaluation enhancement by clarifying the evaluation boundary, facilitating the development of more effective SemanticAE generators. Extensive experiments demonstrate the superiority of the transfer attack performance of InSUR. Moreover, we realize the reference-free generation of semantically constrained 3D adversarial examples for the first time.


Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)

arXiv.org Artificial Intelligence

Language models (LMs) often struggle to generate diverse, human-like creative content, raising concerns about the long-term homogenization of human thought through repeated exposure to similar outputs. Yet scalable methods for evaluating LM output diversity remain limited, especially beyond narrow tasks such as random number or name generation, or beyond repeated sampling from a single model. We introduce Infinity-Chat, a large-scale dataset of 26K diverse, real-world, open-ended user queries that admit a wide range of plausible answers with no single ground truth. We introduce the first comprehensive taxonomy for characterizing the full spectrum of open-ended prompts posed to LMs, comprising 6 top-level categories (e.g., brainstorm & ideation) that further breaks down to 17 subcategories. Using Infinity-Chat, we present a large-scale study of mode collapse in LMs, revealing a pronounced Artificial Hivemind effect in open-ended generation of LMs, characterized by (1) intra-model repetition, where a single model consistently generates similar responses, and more so (2) inter-model homogeneity, where different models produce strikingly similar outputs. Infinity-Chat also includes 31,250 human annotations, across absolute ratings and pairwise preferences, with 25 independent human annotations per example. This enables studying collective and individual-specific human preferences in response to open-ended queries. Our findings show that LMs, reward models, and LM judges are less well calibrated to human ratings on model generations that elicit differing idiosyncratic annotator preferences, despite maintaining comparable overall quality. Overall, INFINITY-CHAT presents the first large-scale resource for systematically studying real-world open-ended queries to LMs, revealing critical insights to guide future research for mitigating long-term AI safety risks posed by the Artificial Hivemind.


Hazard-Responsive Digital Twin for Climate-Driven Urban Resilience and Equity

arXiv.org Artificial Intelligence

Complex events such as wildfires, floods, and heatwaves are no longer isolated phenomena but interlinked hazards that propagate through interconnected infrastructure networks. When one system fails, others that depend on it often cascade toward collapse, producing widespread disruption and social inequity. Recent crises including the 2023 Vermont flooding, the 2024 Texas winter freeze, and the 2025 Southern California wildfire illustrate how climate - amplified events can simultaneously strain energy, water, communication, and transportation systems. Traditional risk assessments, which often treat hazards as discrete and static events, are insufficient to capture the evolving and compounding nature of modern disasters. Digital Twin (DT) technology offers a promising avenue for improving situational awareness and decision - making under such conditions. Originally introduced for aerospace engineering and later adopted across industrial sectors, DTs create real - time virtual counterparts of physical systems using sensor data, predictive modeling, and feedback control (Grieves & Vickers, 2018; Tao et al., 2019) . Within the built environment, DTs have been applied to asset monitoring, predictive maintenance, and urban system management (Errandonea et al., 2020; Fogli, 2019; Fuller et al., 2020) . However, most conventional DTs rely on stable connectivity, complete datasets, and deterministic control assumptions that are not held during crises characterized by cascading failures and data disruption. To address these challenges, the concept of the Risk - Informed Digital Twin (RDT) integrates probabilistic modeling, uncertainty quantification, and decision support within the DT architecture (Pignatta & Alibrandi, 2022; Zio & Miqueles, 2024) .


Learning Reconfigurable Representations for Multimodal Federated Learning with Missing Data

arXiv.org Artificial Intelligence

Multimodal federated learning in real-world settings often encounters incomplete and heterogeneous data across clients. This results in misaligned local feature representations that limit the effectiveness of model aggregation. Unlike prior work that assumes either differing modality sets without missing input features or a shared modality set with missing features across clients, we consider a more general and realistic setting where each client observes a different subset of modalities and might also have missing input features within each modality. To address the resulting misalignment in learned representations, we propose a new federated learning framework featuring locally adaptive representations based on learnable client-side embedding controls that encode each client's data-missing patterns. These embeddings serve as reconfiguration signals that align the globally aggregated representation with each client's local context, enabling more effective use of shared information. Furthermore, the embedding controls can be algorithmically aggregated across clients with similar data-missing patterns to enhance the robustness of reconfiguration signals in adapting the global representation. Empirical results on multiple federated multimodal benchmarks with diverse data-missing patterns across clients demonstrate the efficacy of the proposed method, achieving up to 36.45\% performance improvement under severe data incompleteness. The method is also supported by a theoretical analysis with an explicit performance bound that matches our empirical observations. Our source codes are provided at https://github.com/nmduonggg/PEPSY


A Review of End-to-End Precipitation Prediction Using Remote Sensing Data: from Divination to Machine Learning

arXiv.org Artificial Intelligence

Precipitation prediction has undergone a profound transformation -- from early symbolic and empirical methods rooted in divination and observation, to modern technologies based on atmospheric physics and artificial intelligence. This review traces the historical and technological evolution of precipitation forecasting, presenting a survey about end-to-end precipitation prediction technologies that spans ancient practices, the foundations of meteorological science, the rise of numerical weather prediction (NWP), and the emergence of machine learning (ML) and deep learning (DL) models. We first explore traditional and indigenous forecasting methods, then describe the development of physical modeling and statistical frameworks that underpin contemporary operational forecasting. Particular emphasis is placed on recent advances in neural network-based approaches, including automated deep learning, interpretability-driven design, and hybrid physical-data models. By compositing research across multiple eras and paradigms, this review not only depicts the history of end-to-end precipitation prediction but also outlines future directions in next generation forecasting systems.


Drone Carry-on Weight and Wind Flow Assessment via Micro-Doppler Analysis

arXiv.org Artificial Intelligence

Remote monitoring of drones has become a global objective due to emerging applications in national security and managing aerial delivery traffic. Despite their relatively small size, drones can carry significant payloads, which require monitoring, especially in cases of unauthorized transportation of dangerous goods. A drone's flight dynamics heavily depend on outdoor wind conditions and the carry-on weight, which affect the tilt angle of a drone's body and the rotation velocity of the blades. A surveillance radar can capture both effects, provided a sufficient signal-to-noise ratio for the received echoes and an adjusted postprocessing detection algorithm. Here, we conduct a systematic study to demonstrate that micro-Doppler analysis enables the disentanglement of the impacts of wind and weight on a hovering drone. The physics behind the effect is related to the flight controller, as the way the drone counteracts weight and wind differs. When the payload is balanced, it imposes an additional load symmetrically on all four rotors, causing them to rotate faster, thereby generating a blade-related micro-Doppler shift at a higher frequency. However, the impact of the wind is different. The wind attempts to displace the drone, and to counteract this, the drone tilts to the side. As a result, the forward and rear rotors rotate at different velocities to maintain the tilt angle of the drone body relative to the airflow direction. This causes the splitting in the micro-Doppler spectra. By performing a set of experiments in a controlled environment, specifically, an anechoic chamber for electromagnetic isolation and a wind tunnel for imposing deterministic wind conditions, we demonstrate that both wind and payload details can be extracted using a simple deterministic algorithm based on branching in the micro-Doppler spectra.


Cross-Lingual Stability and Bias in Instruction-Tuned Language Models for Humanitarian NLP

arXiv.org Artificial Intelligence

Humanitarian organizations face a critical choice: invest in costly commercial APIs or rely on free open-weight models for multilingual human rights monitoring. While commercial systems offer reliability, open-weight alternatives lack empirical validation -- especially for low-resource languages common in conflict zones. This paper presents the first systematic comparison of commercial and open-weight large language models (LLMs) for human-rights-violation detection across seven languages, quantifying the cost-reliability trade-off facing resource-constrained organizations. Across 78,000 multilingual inferences, we evaluate six models -- four instruction-aligned (Claude-Sonnet-4, DeepSeek-V3, Gemini-Flash-2.0, GPT-4.1-mini) and two open-weight (LLaMA-3-8B, Mistral-7B) -- using both standard classification metrics and new measures of cross-lingual reliability: Calibration Deviation (CD), Decision Bias (B), Language Robustness Score (LRS), and Language Stability Score (LSS). Results show that alignment, not scale, determines stability: aligned models maintain near-invariant accuracy and balanced calibration across typologically distant and low-resource languages (e.g., Lingala, Burmese), while open-weight models exhibit significant prompt-language sensitivity and calibration drift. These findings demonstrate that multilingual alignment enables language-agnostic reasoning and provide practical guidance for humanitarian organizations balancing budget constraints with reliability in multilingual deployment.


MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion

arXiv.org Artificial Intelligence

As Large Vision-Language Models (LVLMs) are increasingly deployed in domains such as shopping, health, and news, they are exposed to pervasive persuasive content. A critical question is how these models function as persuadees-how and why they can be influenced by persuasive multimodal inputs. Understanding both their susceptibility to persuasion and the effectiveness of different persuasive strategies is crucial, as overly persuadable models may adopt misleading beliefs, override user preferences, or generate unethical or unsafe outputs when exposed to manipulative messages. We introduce MMPersuade, a unified framework for systematically studying multimodal persuasion dynamics in LVLMs. MMPersuade contributes (i) a comprehensive multimodal dataset that pairs images and videos with established persuasion principles across commercial, subjective and behavioral, and adversarial contexts, and (ii) an evaluation framework that quantifies both persuasion effectiveness and model susceptibility via third-party agreement scoring and self-estimated token probabilities on conversation histories. Our study of six leading LVLMs as persuadees yields three key insights: (i) multimodal inputs substantially increase persuasion effectiveness-and model susceptibility-compared to text alone, especially in misinformation scenarios; (ii) stated prior preferences decrease susceptibility, yet multimodal information maintains its persuasive advantage; and (iii) different strategies vary in effectiveness across contexts, with reciprocity being most potent in commercial and subjective contexts, and credibility and logic prevailing in adversarial contexts. By jointly analyzing persuasion effectiveness and susceptibility, MMPersuade provides a principled foundation for developing models that are robust, preference-consistent, and ethically aligned when engaging with persuasive multimodal content.


RL-AVIST: Reinforcement Learning for Autonomous Visual Inspection of Space Targets

arXiv.org Artificial Intelligence

The growing need for autonomous on-orbit services such as inspection, maintenance, and situational awareness calls for intelligent spacecraft capable of complex maneuvers around large orbital targets. Traditional control systems often fall short in adaptability, especially under model uncertainties, multi-spacecraft configurations, or dynamically evolving mission contexts. This paper introduces RL-AVIST, a Reinforcement Learning framework for Autonomous Visual Inspection of Space Targets. Leveraging the Space Robotics Bench (SRB), we simulate high-fidelity 6-DOF spacecraft dynamics and train agents using DreamerV3, a state-of-the-art model-based RL algorithm, with PPO and TD3 as model-free baselines. Our investigation focuses on 3D proximity maneuvering tasks around targets such as the Lunar Gateway and other space assets. We evaluate task performance under two complementary regimes: generalized agents trained on randomized velocity vectors, and specialized agents trained to follow fixed trajectories emulating known inspection orbits. Furthermore, we assess the robustness and generalization of policies across multiple spacecraft morphologies and mission domains. Results demonstrate that model-based RL offers promising capabilities in trajectory fidelity, and sample efficiency, paving the way for scalable, retrainable control solutions for future space operations