Goto

Collaborating Authors

 dependability


D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning

Zhang, Jia, Zhang, Chen-Xi, Liu, Yao, Jin, Yi-Xuan, Yang, Xiao-Wen, Zheng, Bo, Liu, Yi, Guo, Lan-Zhe

arXiv.org Artificial Intelligence

Recent advancements in instruction tuning for large language models (LLMs) suggest that a small, high-quality dataset can significantly equip LLMs with instruction-following capabilities, outperforming large datasets often burdened by quality and redundancy issues. However, the challenge lies in automatically identifying valuable subsets from large datasets to boost both the effectiveness and efficiency of instruction tuning. In this paper, we first establish data selection criteria based on three distinct aspects of data value: diversity, difficulty, and dependability, and then propose the D3 method comprising two key steps of scoring and selection. Specifically, in the scoring step, we define the diversity function to measure sample distinctiveness and introduce the uncertainty-based prediction difficulty to evaluate sample difficulty by mitigating the interference of context-oriented generation diversity. Additionally, we integrate an external LLM for dependability assessment. In the selection step, we formulate the D3 weighted coreset objective, which jointly optimizes three aspects of data value to solve for the most valuable subset. The two steps of D3 can iterate multiple rounds, incorporating feedback to refine the selection focus adaptively. Experiments on three datasets demonstrate the effectiveness of D3 in endowing LLMs with competitive or even superior instruction-following capabilities using less than 10% of the entire dataset.


A Robust Federated Learning Framework for Undependable Devices at Scale

Wang, Shilong, Liu, Jianchun, Xu, Hongli, Qiao, Chunming, Deng, Huarong, Zheng, Qiuye, Gong, Jiantao

arXiv.org Artificial Intelligence

In a federated learning (FL) system, many devices, such as smartphones, are often undependable (e.g., frequently disconnected from WiFi) during training. Existing FL frameworks always assume a dependable environment and exclude undependable devices from training, leading to poor model performance and resource wastage. In this paper, we propose FLUDE to effectively deal with undependable environments. First, FLUDE assesses the dependability of devices based on the probability distribution of their historical behaviors (e.g., the likelihood of successfully completing training). Based on this assessment, FLUDE adaptively selects devices with high dependability for training. To mitigate resource wastage during the training phase, FLUDE maintains a model cache on each device, aiming to preserve the latest training state for later use in case local training on an undependable device is interrupted. Moreover, FLUDE proposes a staleness-aware strategy to judiciously distribute the global model to a subset of devices, thus significantly reducing resource wastage while maintaining model performance. We have implemented FLUDE on two physical platforms with 120 smartphones and NVIDIA Jetson devices. Extensive experimental results demonstrate that FLUDE can effectively improve model performance and resource efficiency of FL training in undependable environments.


Power Plays: Unleashing Machine Learning Magic in Smart Grids

Rashid, Abdur, Biswas, Parag, masum, abdullah al, Nasim, MD Abdullah Al, Gupta, Kishor Datta

arXiv.org Artificial Intelligence

The integration of machine learning into smart grid systems represents a transformative step in enhancing the efficiency, reliability, and sustainability of modern energy networks. By adding advanced data analytics, these systems can better manage the complexities of renewable energy integration, demand response, and predictive maintenance. Machine learning algorithms analyze vast amounts of data from smart meters, sensors, and other grid components to optimize energy distribution, forecast demand, and detect irregularities that could indicate potential failures. This enables more precise load balancing, reduces operational costs, and enhances the resilience of the grid against disturbances. Furthermore, the use of predictive models helps in anticipating equipment failures, thereby improving the reliability of the energy supply. As smart grids continue to evolve, the role of machine learning in managing decentralized energy sources and enabling real-time decision-making will become increasingly critical. However, the deployment of these technologies also raises challenges related to data privacy, security, and the need for robust infrastructure. Addressing these issues in this research authors will focus on realizing the full potential of smart grids, ensuring they meet the growing energy demands while maintaining a focus on sustainability and efficiency using Machine Learning techniques. Furthermore, this research will help determine the smart grid's essentiality with the aid of Machine Learning. Multiple ML algorithms have been integrated along with their pros and cons. The future scope of these algorithms are also integrated.


Assurance of AI Systems From a Dependability Perspective

Bloomfield, Robin, Rushby, John

arXiv.org Artificial Intelligence

We outline the principles of classical assurance for computer-based systems that pose significant risks. We then consider application of these principles to systems that employ Artificial Intelligence (AI) and Machine Learning (ML). A key element in this "dependability" perspective is a requirement to have near-complete understanding of the behavior of critical components, and this is considered infeasible for AI and ML. Hence the dependability perspective aims to minimize trust in AI and ML elements by using "defense in depth" with a hierarchy of less complex systems, some of which may be highly assured conventionally engineered components, to "guard" them. This may be contrasted with the "trustworthy" perspective that seeks to apply assurance to the AI and ML elements themselves. In cyber-physical and many other systems, it is difficult to provide guards that do not depend on AI and ML to perceive their environment (e.g., other vehicles sharing the road with a self-driving car), so both perspectives are needed and there is a continuum or spectrum between them. We focus on architectures toward the dependability end of the continuum and invite others to consider additional points along the spectrum. For guards that require perception using AI and ML, we examine ways to minimize the trust placed in these elements; they include diversity, defense in depth, explanations, and micro-ODDs. We also examine methods to enforce acceptable behavior, given a model of the world. These include classical cyber-physical calculations and envelopes, and normative rules based on overarching principles, constitutions, ethics, or reputation. We apply our perspective to autonomous systems, AI systems for specific functions, generic AI such as Large Language Models, and to Artificial General Intelligence (AGI), and we propose current best practice and an agenda for research.


Dependable Distributed Training of Compressed Machine Learning Models

Malandrino, Francesco, Di Giacomo, Giuseppe, Levorato, Marco, Chiasserini, Carla Fabiana

arXiv.org Artificial Intelligence

The existing work on the distributed training of machine learning (ML) models has consistently overlooked the distribution of the achieved learning quality, focusing instead on its average value. This leads to a poor dependability}of the resulting ML models, whose performance may be much worse than expected. We fill this gap by proposing DepL, a framework for dependable learning orchestration, able to make high-quality, efficient decisions on (i) the data to leverage for learning, (ii) the models to use and when to switch among them, and (iii) the clusters of nodes, and the resources thereof, to exploit. For concreteness, we consider as possible available models a full DNN and its compressed versions. Unlike previous studies, DepL guarantees that a target learning quality is reached with a target probability, while keeping the training cost at a minimum. We prove that DepL has constant competitive ratio and polynomial complexity, and show that it outperforms the state-of-the-art by over 27% and closely matches the optimum.


Towards Formal Fault Injection for Safety Assessment of Automated Systems

Farooqui, Ashfaq, Sangchoolie, Behrooz

arXiv.org Artificial Intelligence

Reasoning about safety, security, and other dependability attributes of autonomous systems is a challenge that needs to be addressed before the adoption of such systems in day-to-day life. Formal methods is a class of methods that mathematically reason about a system's behavior. Thus, a correctness proof is sufficient to conclude the system's dependability. However, these methods are usually applied to abstract models of the system, which might not fully represent the actual system. Fault injection, on the other hand, is a testing method to evaluate the dependability of systems. However, the amount of testing required to evaluate the system is rather large and often a problem. This vision paper introduces formal fault injection, a fusion of these two techniques throughout the development lifecycle to enhance the dependability of autonomous systems. We advocate for a more cohesive approach by identifying five areas of mutual support between formal methods and fault injection. By forging stronger ties between the two fields, we pave the way for developing safe and dependable autonomous systems. This paper delves into the integration's potential and outlines future research avenues, addressing open challenges along the way.


Enhancing Cyber-Resilience in Self-Healing Cyber-Physical Systems with Implicit Guarantees

Loh, Randolph, Thing, Vrizlynn L. L.

arXiv.org Artificial Intelligence

Self-Healing Cyber-Physical Systems (SH-CPS) effectively recover from system perceived failures without human intervention. They ensure a level of resilience and tolerance to unforeseen situations that arise from intrinsic system and component degradation, errors, or malicious attacks. Implicit redundancy can be exploited in SH-CPS to structurally adapt without the need to explicitly duplicate components. However, implicitly redundant components do not guarantee the same level of dependability as the primary component used to provide for a given function. Additional processes are needed to restore critical system functionalities as desired. This work introduces implicit guarantees to ensure the dependability of implicitly redundant components and processes. Implicit guarantees can be obtained through inheritance and decomposition. Therefore, a level of dependability can be guaranteed in SH-CPS after adaptation and recovery while complying with requirements. We demonstrate compliance with the requirement guarantees while ensuring resilience in SH-CPS.


System Theoretic View on Uncertainties

Gansch, Roman, Adee, Ahmad

arXiv.org Artificial Intelligence

The complexity of the operating environment and required technologies for highly automated driving is unprecedented. A different type of threat to safe operation besides the fault-error-failure model by Laprie et al. arises in the form of performance limitations. We propose a system theoretic approach to handle these and derive a taxonomy based on uncertainty, i.e. lack of knowledge, as a root cause. Uncertainty is a threat to the dependability of a system, as it limits our ability to assess its dependability properties. We distinguish uncertainties by aleatory (inherent to probabilistic models), epistemic (lack of model parameter knowledge) and ontological (incompleteness of models) in order to determine strategies and methods to cope with them. Analogous to the taxonomy of Laprie et al. we cluster methods into uncertainty prevention (use of elements with well-known behavior, avoiding architectures prone to emergent behavior, restriction of operational design domain, etc.), uncertainty removal (during design time by design of experiment, etc. and after release by field observation, continuous updates, etc.), uncertainty tolerance (use of redundant architectures with diverse uncertainties, uncertainty aware deep learning, etc.) and uncertainty forecasting (estimation of residual uncertainty, etc.).


Towards Quantification of Assurance for Learning-enabled Components

Asaadi, Erfan, Denney, Ewen, Pai, Ganesh

arXiv.org Artificial Intelligence

Perception, localization, planning, and control, high-level functions often organized in a so-called pipeline, are amongst the core building blocks of modern autonomous (ground, air, and underwater) vehicle architectures. These functions are increasingly being implemented using learning-enabled components (LECs), i.e., (software) components leveraging knowledge acquisition and learning processes such as deep learning. Providing quantified component-level assurance as part of a wider (dynamic) assurance case can be useful in supporting both pre-operational approval of LECs (e.g., by regulators), and runtime hazard mitigation, e.g., using assurance-based failover configurations. This paper develops a notion of assurance for LECs based on i) identifying the relevant dependability attributes, and ii) quantifying those attributes and the associated uncertainty, using probabilistic techniques. We give a practical grounding for our work using an example from the aviation domain: an autonomous taxiing capability for an unmanned aircraft system (UAS), focusing on the application of LECs as sensors in the perception function. We identify the applicable quantitative measures of assurance, and characterize the associated uncertainty using a non-parametric Bayesian approach, namely Gaussian process regression. We additionally discuss the relevance and contribution of LEC assurance to system-level assurance, the generalizability of our approach, and the associated challenges.


A Novel Multi-Layer Framework for BVLoS Drone Operation: A Preliminary Study

Sorbelli, Francesco Betti, Chatterjee, Punyasha, Corò, Federico, Palazzetti, Lorenzo, Pinotti, Cristina M.

arXiv.org Artificial Intelligence

Drones have become increasingly popular in a variety of fields, including agriculture, emergency response, and package delivery. However, most drone operations are currently limited to within Visual Line of Sight (vlos) due to safety concerns. Flying drones Beyond Visual Line of Sight (bvlos) presents new challenges and opportunities, but also requires new technologies and regulatory frameworks, not yet implemented, to ensure that the drone is constantly under the control of a remote operator. In this preliminary study, we assume to remotely control the drone using the available ground cellular network infrastructure. We propose to plan bvlos drone operations using a novel multi-layer framework that includes many layers of constraints that closely resemble real-world scenarios and challenges. These layers include information such as the potential ground risk in the event of a drone failure, the available ground cellular network infrastructure, and the presence of ground obstacles. From the multi-layer framework, a graph is constructed whose edges are weighted with a dependability score that takes into account the information of the multi-layer framework. Then, the planning of bvlos drone missions is equivalent to solving the Maximum Path Dependability Problem on the constructed graph, which turns out to be solvable by applying Dijkstra's algorithm.