Goto

Collaborating Authors

 dtmc


Bounded PCTL Model Checking of Large Language Model Outputs

arXiv.org Artificial Intelligence

In this paper, we introduce LLMCHECKER, a model-checking-based verification method to verify the probabilistic computation tree logic (PCTL) properties of an LLM text generation process. We empirically show that only a limited number of tokens are typically chosen during text generation, which are not always the same. This insight drives the creation of $ฮฑ$-$k$-bounded text generation, narrowing the focus to the $ฮฑ$ maximal cumulative probability on the top-$k$ tokens at every step of the text generation process. Our verification method considers an initial string and the subsequent top-$k$ tokens while accommodating diverse text quantification methods, such as evaluating text quality and biases. The threshold $ฮฑ$ further reduces the selected tokens, only choosing those that exceed or meet it in cumulative probability. LLMCHECKER then allows us to formally verify the PCTL properties of $ฮฑ$-$k$-bounded LLMs. We demonstrate the applicability of our method in several LLMs, including Llama, Gemma, Mistral, Genstruct, and BERT. To our knowledge, this is the first time PCTL-based model checking has been used to check the consistency of the LLM text generation process.


Learning Probabilistic Temporal Logic Specifications for Stochastic Systems

arXiv.org Artificial Intelligence

There has been substantial progress in the inference of formal behavioural specifications from sample trajectories, for example using Linear Temporal Logic (L TL). However, these techniques cannot handle specifications that correctly characterise systems with stochastic behaviour, which occur commonly in reinforcement learning and formal verification. We consider the passive learning problem of inferring a Boolean combination of probabilistic L TL (PL TL) formulas from a set of Markov chains, classified as either positive or negative. We propose a novel learning algorithm that infers concise PL TL specifications, leveraging grammar-based enumeration, search heuristics, probabilistic model checking and Boolean set-cover procedures. We demonstrate the effectiveness of our algorithm in two use cases: learning from policies induced by RL algorithms and learning from variants of a probabilistic model. In both cases, our method automatically and efficiently extracts PL TL specifications that succinctly characterize the temporal differences between the policies or model variants.


Turn-based Multi-Agent Reinforcement Learning Model Checking

arXiv.org Artificial Intelligence

In this paper, we propose a novel approach for verifying the compliance of turn-based multi-agent reinforcement learning (TMARL) agents with complex requirements in stochastic multiplayer games. Our method overcomes the limitations of existing verification approaches, which are inadequate for dealing with TMARL agents and not scalable to large games with multiple agents. Our approach relies on tight integration of TMARL and a verification technique referred to as model checking. We demonstrate the effectiveness and scalability of our technique through experiments in different types of environments. Our experiments show that our method is suited to verify TMARL agents and scales better than naive monolithic model checking.


TIPS: Topologically Important Path Sampling for Anytime Neural Networks

arXiv.org Artificial Intelligence

Anytime neural networks (AnytimeNNs) are a promising solution to adaptively adjust the model complexity at runtime under various hardware resource constraints. However, the manually-designed AnytimeNNs are biased by designers' prior experience and thus provide sub-optimal solutions. To address the limitations of existing hand-crafted approaches, we first model the training process of AnytimeNNs as a discrete-time Markov chain (DTMC) and use it to identify the paths that contribute the most to the training of AnytimeNNs. Based on this new DTMC-based analysis, we further propose TIPS, a framework to automatically design AnytimeNNs under various hardware constraints. Our experimental results show that TIPS can improve the convergence rate and test accuracy of AnytimeNNs. Compared to the existing AnytimeNNs approaches, TIPS improves the accuracy by 2%-6.6% on multiple datasets and achieves SOTA accuracy-FLOPs tradeoffs.


Adversarial Robustness Verification and Attack Synthesis in Stochastic Systems

arXiv.org Artificial Intelligence

Probabilistic model checking is a useful technique for specifying and verifying properties of stochastic systems including randomized protocols and reinforcement learning models. Existing methods rely on the assumed structure and probabilities of certain system transitions. These assumptions may be incorrect, and may even be violated by an adversary who gains control of system components. In this paper, we develop a formal framework for adversarial robustness in systems modeled as discrete time Markov chains (DTMCs). We base our framework on existing methods for verifying probabilistic temporal logic properties and extend it to include deterministic, memoryless policies acting in Markov decision processes (MDPs). Our framework includes a flexible approach for specifying structure-preserving and non structure-preserving adversarial models. We outline a class of threat models under which adversaries can perturb system transitions, constrained by an $\varepsilon$ ball around the original transition probabilities. We define three main DTMC adversarial robustness problems: adversarial robustness verification, maximal $\delta$ synthesis, and worst case attack synthesis. We present two optimization-based solutions to these three problems, leveraging traditional and parametric probabilistic model checking techniques. We then evaluate our solutions on two stochastic protocols and a collection of Grid World case studies, which model an agent acting in an environment described as an MDP. We find that the parametric solution results in fast computation for small parameter spaces. In the case of less restrictive (stronger) adversaries, the number of parameters increases, and directly computing property satisfaction probabilities is more scalable. We demonstrate the usefulness of our definitions and solutions by comparing system outcomes over various properties, threat models, and case studies.


Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems

arXiv.org Artificial Intelligence

While Deep Reinforcement Learning (DRL) provides transformational capabilities to the control of Robotics and Autonomous Systems (RAS), the black-box nature of DRL and uncertain deployment-environments of RAS pose new challenges on its dependability. Although there are many existing works imposing constraints on the DRL policy to ensure a successful completion of the mission, it is far from adequate in terms of assessing the DRL-driven RAS in a holistic way considering all dependability properties. In this paper, we formally define a set of dependability properties in temporal logic and construct a Discrete-Time Markov Chain (DTMC) to model the dynamics of risk/failures of a DRL-driven RAS interacting with the stochastic environment. We then do Probabilistic Model Checking based on the designed DTMC to verify those properties. Our experimental results show that the proposed method is effective as a holistic assessment framework, while uncovers conflicts between the properties that may need trade-offs in the training. Moreover, we find the standard DRL training cannot improve dependability properties, thus requiring bespoke optimisation objectives concerning them. Finally, our method offers a novel dependability analysis to the Sim-to-Real challenge of DRL.


Probabilistic Verification of Neural Networks Against Group Fairness

arXiv.org Artificial Intelligence

Fairness is crucial for neural networks which are used in applications with important societal implication. Recently, there have been multiple attempts on improving fairness of neural networks, with a focus on fairness testing (e.g., generating individual discriminatory instances) and fairness training (e.g., enhancing fairness through augmented training). In this work, we propose an approach to formally verify neural networks against fairness, with a focus on independence-based fairness such as group fairness. Our method is built upon an approach for learning Markov Chains from a user-provided neural network (i.e., a feed-forward neural network or a recurrent neural network) which is guaranteed to facilitate sound analysis. The learned Markov Chain not only allows us to verify (with Probably Approximate Correctness guarantee) whether the neural network is fair or not, but also facilities sensitivity analysis which helps to understand why fairness is violated. We demonstrate that with our analysis results, the neural weights can be optimized to improve fairness. Our approach has been evaluated with multiple models trained on benchmark datasets and the experiment results show that our approach is effective and efficient.


Viterbi Extraction tutorial with Hidden Markov Toolkit

arXiv.org Artificial Intelligence

An algorithm used to extract HMM parameters is revisited. Most parts of the extraction process are taken from implemented Hidden Markov Toolkit (HTK) program under name HInit. The algorithm itself shows a few variations compared to another domain of implementations. The HMM model is introduced briefly based on the theory of Discrete Time Markov Chain. We schematically outline the Viterbi method implemented in HTK. Iterative definition of the method which is ready to be implemented in computer programs is reviewed. We also illustrate the method calculation precisely using manual calculation and extensive graphical illustration. The distribution of observation probability used is simply independent Gaussians r.v.s. The purpose of the content is not to justify the performance or accuracy of the method applied in a specific area. This writing merely to describe how the algorithm is performed. The whole content should enlighten the audience the insight of the Viterbi Extraction method used by HTK.


Probabilistic Model Checking of Robots Deployed in Extreme Environments

arXiv.org Artificial Intelligence

Robots are increasingly used to carry out critical missions in extreme environments that are hazardous for humans. This requires a high degree of operational autonomy under uncertain conditions, and poses new challenges for assuring the robot's safety and reliability. In this paper, we develop a framework for probabilistic model checking on a layered Markov model to verify the safety and reliability requirements of such robots, both at pre-mission stage and during runtime. Two novel estimators based on conservative Bayesian inference and imprecise probability model with sets of priors are introduced to learn the unknown transition parameters from operational data. We demonstrate our approach using data from a real-world deployment of unmanned underwater vehicles in extreme environments.


Trusted Machine Learning: Model Repair and Data Repair for Probabilistic Models

AAAI Conferences

When machine learning algorithms are used in life-critical or mission-critical applications (e.g., self driving cars, cyber security, surgical robotics), it is important to ensure that they provide some high-level correctness guarantees. We introduce a paradigm called Trusted Machine Learning (TML) with the goal of making learning techniques more trustworthy. We outline methods that show how symbolic analysis (specifi- cally parametric model checking) can be used to learn the dynamical model of a system where the learned model satis- fies correctness requirements specified in the form of temporal logic properties (e.g., safety, liveness). When a learned model does not satisfy the desired guarantees, we try two approaches: (1) Model Repair, wherein we modify a learned model directly, and (2) Data Repair, wherein we modify the data so that re-learning from the modified data will result in a trusted model. Model Repair tries to make the minimal changes to the trained model while satisfying the properties, whereas Data Repair tries to make the minimal changes to the dataset used to train the model for ensuring satisfaction of the properties. We show how the Model Repair and Data Repair problems can be solved for the case of probabilistic models, specifically Discrete-Time Markov Chains (DTMC) or Markov Decision Processes (MDP), when the desired properties are expressed in Probabilistic Computation Tree Logic (PCTL). Specifically, we outline how the parameter learning problem in the probabilistic Markov models under temporal logic constraints can be equivalently expressed as a non-linear optimization with non-linear rational constraints, by performing symbolic transformations using a parametric model checker. We illustrate the approach on two case studies: a controller for automobile lane changing, and query router for a wireless sensor network.