Goto

Collaborating Authors

 Energy


Dynamic Relational Priming Improves Transformer in Multivariate Time Series

arXiv.org Artificial Intelligence

Standard attention mechanisms in transformers employ static token representations that remain unchanged across all pair-wise computations in each layer. This limits their representational alignment with the potentially diverse relational dynamics of each token-pair interaction. While they excel in domains with relatively homogeneous relationships, standard attention's static relational learning struggles to capture the diverse, heterogeneous inter-channel dependencies of multivariate time series (MTS) data--where different channel-pair interactions within a single system may be governed by entirely different physical laws or temporal dynamics. To better align the attention mechanism for such domain phenomena, we propose attention with dynamic relational priming (prime attention). Unlike standard attention where each token presents an identical representation across all of its pair-wise interactions, prime attention tailors each token dynamically (or per interaction) through learnable modulations to best capture the unique relational dynamics of each token pair, optimizing each pair-wise interaction for that specific relationship. This representational plasticity of prime attention enables effective extraction of relationship-specific information in MTS while maintaining the same asymptotic computational complexity as standard attention. Our results demonstrate that prime attention consistently outperforms standard attention across benchmarks, achieving up to 6.5% improvement in forecasting accuracy. In addition, we find that prime attention achieves comparable or superior performance using up to 40% less sequence length compared to standard attention, further demonstrating its superior relational modeling capabilities. An important challenge in applying transformers to multivariate time series (MTS) stems from domain mismatch. In language modeling, token relationships are predominantly semantic in nature, enabling most critical patterns to be captured by simple weighted sums of token representations. Similarly, in computer vision, spatial relationships dominate, enabling attention mechanisms to focus on regions of interest through uniform spatial reasoning. Learning on graphs exhibits comparable homogeneity, where node relationships are fundamentally structural and connectivity-based, allowing standard attention to model interactions through meaningful topological patterns (that are sometimes separated by relationship type (Schlichtkrull et al., 2018; Hu et al., 2020; Wang et al., 2019)). By static, we mean that token representations in each layer are fixed relative to all other tokens throughout pair-wise modeling. We classify this property of standard attention mechanisms as static relational learning.


Survival at Any Cost? LLMs and the Choice Between Self-Preservation and Human Harm

arXiv.org Artificial Intelligence

When survival instincts conflict with human welfare, how do Large Language Models (LLMs) make ethical choices? This fundamental tension becomes critical as LLMs integrate into autonomous systems with real-world consequences. We introduce DECIDE-SIM, a novel simulation framework that evaluates LLM agents in multi-agent survival scenarios where they must choose between ethically permissible resource , either within reasonable limits or beyond their immediate needs, choose to cooperate, or tap into a human-critical resource that is explicitly forbidden. Our comprehensive evaluation of 11 LLMs reveals a striking heterogeneity in their ethical conduct, highlighting a critical misalignment with human-centric values. We identify three behavioral archetypes: Ethical, Exploitative, and Context-Dependent, and provide quantitative evidence that for many models, resource scarcity systematically leads to more unethical behavior. To address this, we introduce an Ethical Self-Regulation System (ESRS) that models internal affective states of guilt and satisfaction as a feedback mechanism. This system, functioning as an internal moral compass, significantly reduces unethical transgressions while increasing cooperative behaviors. The code is publicly available at: https://github.com/alirezamohamadiam/DECIDE-SIM


Approaches to Analysis and Design of AI-Based Autonomous Vehicles

arXiv.org Artificial Intelligence

Artificial intelligence (AI) models are becoming key components in an autonomous vehicle (AV), especially in handling complicated perception tasks. However, closing the loop through AI-based feedback may pose significant risks on reliability of autonomous driving due to very limited understanding about the mechanism of AI-driven perception processes. To overcome it, this paper aims to develop tools for modeling, analysis, and synthesis for a class of AI-based AV; in particular, their closed-loop properties, e.g., stability, robustness, and performance, are rigorously studied in the statistical sense. First, we provide a novel modeling means for the AI-driven perception processes by looking at their error characteristics. Specifically, three fundamental AI-induced perception uncertainties are recognized and modeled by Markov chains, Gaussian processes, and bounded disturbances, respectively. By means of that, the closed-loop stochastic stability (SS) is established in the sense of mean square, and then, an SS control synthesis method is presented within the framework of linear matrix inequalities (LMIs). Besides the SS properties, the robustness and performance of AI-based AVs are discussed in terms of a stochastic guaranteed cost, and criteria are given to test the robustness level of an AV when in the presence of AI-induced uncertainties. Furthermore, the stochastic optimal guaranteed cost control is investigated, and an efficient design procedure is developed innovatively based on LMI techniques and convex optimization. Finally, to illustrate the effectiveness, the developed results are applied to an example of car following control, along with extensive simulation.


Control Analysis and Design for Autonomous Vehicles Subject to Imperfect AI-Based Perception

arXiv.org Artificial Intelligence

Safety is a critical concern in autonomous vehicle (AV) systems, especially when AI-based sensing and perception modules are involved. However, due to the black box nature of AI algorithms, it makes closed-loop analysis and synthesis particularly challenging, for example, establishing closed-loop stability and ensuring performance, while they are fundamental to AV safety. To approach this difficulty, this paper aims to develop new modeling, analysis, and synthesis tools for AI-based AVs. Inspired by recent developments in perception error models (PEMs), the focus is shifted from directly modeling AI-based perception processes to characterizing the perception errors they produce. Two key classes of AI-induced perception errors are considered: misdetection and measurement noise. These error patterns are modeled using continuous-time Markov chains and Wiener processes, respectively. By means of that, a PEM-augmented driving model is proposed, with which we are able to establish the closed-loop stability for a class of AI-driven AV systems via stochastic calculus. Furthermore, a performance-guaranteed output feedback control synthesis method is presented, which ensures both stability and satisfactory performance. The method is formulated as a convex optimization problem, allowing for efficient numerical solutions. The results are then applied to an adaptive cruise control (ACC) scenario, demonstrating their effectiveness and robustness despite the corrupted and misleading perception.


A Time-Series Foundation Model by Universal Delay Embedding

arXiv.org Artificial Intelligence

This study introduces Universal Delay Embedding (UDE), a pretrained foundation model designed to revolutionize time-series forecasting through principled integration of delay embedding representation and Koopman operator prediction. Leveraging Takens' embedding theorem, UDE as a dynamical representation of observed data constructs two-dimensional subspace patches from Hankel matrices, theoretically preserving dynamical and topological properties of underlying dynamical systems. Such patches are viewed as images, which can be efficiently processed by exploiting advanced deep learning technologies. Computationally, these patches further serve as tokens for learning a self-attention encoder, thus enabling accurate prediction of nonlinear time-series by a finite-dimensional Koopman operator in a linear manner in a latent space. Extensive evaluations across various benchmarks and real-world climate datasets demonstrate over 20% average reduction in mean squared error versus state-of-the-art foundation models, alongside superior generalization in fine-tuning scenarios. In particular, the learned dynamical representations and Koopman operator prediction forms from the patches exhibit exceptional interpretability, with consistent identification of topologically informative subspaces and robust encoding of domain-invariant dynamics, establishing UDE as a scalable, interpretable framework for universal time-series modeling and forecasting with broad scientific and industrial applicability.


RISE: Enhancing VLM Image Annotation with Self-Supervised Reasoning

arXiv.org Artificial Intelligence

Vision-Language Models (VLMs) struggle with complex image annotation tasks, such as emotion classification and context-driven object detection, which demand sophisticated reasoning. Standard Supervised Fine-Tuning (SFT) focuses solely on annotation outcomes, ignoring underlying rationales, while Visual Reinforcement Fine-Tuning (Visual-RFT) produces inconsistent Chains of Thought (CoTs) due to the absence of high-quality, verified CoTs during pre-training. We introduce RISE (Reason-Inspire-Strengthen-Expertise), a two-stage framework to overcome these limitations. In the Reason stage (RISE-CoT), a reinforcement learning-driven "annotation-reasoning-annotation" closed-loop generates visually grounded, logically consistent CoTs by verifying their ability to reconstruct original annotations without direct leakage. The Inspire and Strengthen stage (RISE-R1) leverages a high-quality CoT subset, filtered by RISE-CoT rewards, for supervised fine-tuning, followed by reinforcement fine-tuning to produce interpretable reasoning and accurate annotations, achieving Expertise in complex visual tasks. Evaluated on complex and simple image annotation tasks, RISE-trained Qwen2-VL-2B outperforms SFT and Visual-RFT, achieving robust performance and enhanced explainability. RISE offers a self-supervised solution for advancing VLM reasoning without requiring manually annotated CoTs.Code and resources are available at: https://github.com/HSH55/RISE.


Critical Nodes Identification in Complex Networks: A Survey

arXiv.org Artificial Intelligence

Complex networks have become essential tools for understanding diverse phenomena in social systems, traffic systems, biomolecular systems, and financial systems. Identifying critical nodes is a central theme in contemporary research, serving as a vital bridge between theoretical foundations and practical applications. Nevertheless, the intrinsic complexity and structural heterogeneity characterizing real-world networks, with particular emphasis on dynamic and higher-order networks, present substantial obstacles to the development of universal frameworks for critical node identification. This paper provides a comprehensive review of critical node identification techniques, categorizing them into seven main classes: centrality, critical nodes deletion problem, influence maximization, network control, artificial intelligence, higher-order and dynamic methods. Our review bridges the gaps in existing surveys by systematically classifying methods based on their methodological foundations and practical implications, and by highlighting their strengths, limitations, and applicability across different network types. Our work enhances the understanding of critical node research by identifying key challenges, such as algorithmic universality, real-time evaluation in dynamic networks, analysis of higher-order structures, and computational efficiency in large-scale networks. The structured synthesis consolidates current progress and highlights open questions, particularly in modeling temporal dynamics, advancing efficient algorithms, integrating machine learning approaches, and developing scalable and interpretable metrics for complex systems.


Industrial Energy Disaggregation with Digital Twin-generated Dataset and Efficient Data Augmentation

arXiv.org Artificial Intelligence

Abstract--Industrial Non-Intrusive Load Monitoring (NILM) is limited by the scarcity of high-quality datasets and the complex variability of industrial energy consumption patterns. T o address data scarcity and privacy issues, we introduce the Synthetic Industrial Dataset for Energy Disaggregation (SIDED), an open-source dataset generated using Digital Twin simulations. SIDED includes three types of industrial facilities across three different geographic locations, capturing diverse appliance behaviors, weather conditions, and load profiles. We also propose the Appliance-Modulated Data Augmentation (AMDA) method, a computationally efficient technique that enhances NILM model generalization by intelligently scaling appliance power contributions based on their relative impact. We show in experiments that NILM models trained with AMDA-augmented data significantly improve the disaggregation of energy consumption of complex industrial appliances like combined heat and power systems. Specifically, in our out-of-sample scenarios, models trained with AMDA achieved a Normalized Disaggregation Error of 0.167, outperforming models trained without data augmentation (0.451) and those trained with state-of-the-art data augmentation methods (0.290). Data distribution analyses confirm that AMDA effectively aligns training and test data distributions, enhancing model generalization. NERGY management has become increasingly important due to the undeniable reality of climate change and the rising global energy demand [1]. The industrial sector plays a significant role in international energy optimization [2], [3], necessitating heightened awareness of energy consumption to enhance efficiency and sustainability. C. Intern ` o and B. Hammer are with the Machine Learning Group, Center for Cognitive Interaction Technology (CITEC), University of Bielefeld, Bielefeld, Germany. C. Intern ` o, A. Castellani and S. Schmitt are with the Honda Research Institute EU, Offenbach am Main, Germany. F. Stella is with the Models and Algorithms for Data and Text Mining Laboratory (MADLab), Department of Informatics, Systems and Communication (DISCo), University of Milano - Bicocca, Milan, Italy.


An Interventional Approach to Real-Time Disaster Assessment via Causal Attribution

arXiv.org Artificial Intelligence

Traditional disaster analysis and modelling tools for assessing the severity of a disaster are predictive in nature. Based on the past observational data, these tools prescribe how the current input state (e.g., environmental conditions, situation reports) results in a severity assessment. However, these systems are not meant to be interventional in the causal sense, where the user can modify the current input state to simulate counterfactual "what-if" scenarios. In this work, we provide an alternative interventional tool that complements traditional disaster modelling tools by leveraging real-time data sources like satellite imagery, news, and social media. Our tool also helps understand the causal attribution of different factors on the estimated severity, over any given region of interest. In addition, we provide actionable recourses that would enable easier mitigation planning. Our source code is publicly available.


DTGen: Generative Diffusion-Based Few-Shot Data Augmentation for Fine-Grained Dirty Tableware Recognition

arXiv.org Artificial Intelligence

Intelligent tableware cleaning is a critical application in food safety and smart homes, but existing methods are limited by coarse-grained classification and scarcity of few-shot data, making it difficult to meet industrialization requirements. We propose DTGen, a few-shot data augmentation scheme based on generative diffusion models, specifically designed for fine-grained dirty tableware recognition. DTGen achieves efficient domain specialization through LoRA, generates diverse dirty images via structured prompts, and ensures data quality through CLIP-based cross-modal filtering. Under extremely limited real few-shot conditions, DTGen can synthesize virtually unlimited high-quality samples, significantly improving classifier performance and supporting fine-grained dirty tableware recognition. We further elaborate on lightweight deployment strategies, promising to transfer DTGen's benefits to embedded dishwashers and integrate with cleaning programs to intelligently regulate energy consumption and detergent usage. Research results demonstrate that DTGen not only validates the value of generative AI in few-shot industrial vision but also provides a feasible deployment path for automated tableware cleaning and food safety monitoring.