Ghidini, Chiara
Generating Counterfactual Explanations Under Temporal Constraints
Buliga, Andrei, Di Francescomarino, Chiara, Ghidini, Chiara, Montali, Marco, Ronzani, Massimiliano
Counterfactual explanations are one of the prominent eXplainable Artificial Intelligence (XAI) techniques, and suggest changes to input data that could alter predictions, leading to more favourable outcomes. Existing counterfactual methods do not readily apply to temporal domains, such as that of process mining, where data take the form of traces of activities that must obey to temporal background knowledge expressing which dynamics are possible and which not. Specifically, counterfactuals generated off-the-shelf may violate the background knowledge, leading to inconsistent explanations. This work tackles this challenge by introducing a novel approach for generating temporally constrained counterfactuals, guaranteed to comply by design with background knowledge expressed in Linear Temporal Logic on process traces (LTLp). We do so by infusing automata-theoretic techniques for LTLp inside a genetic algorithm for counterfactual generation. The empirical evaluation shows that the generated counterfactuals are temporally meaningful and more interpretable for applications involving temporal dependencies.
Generating the Traces You Need: A Conditional Generative Model for Process Mining Data
Graziosi, Riccardo, Ronzani, Massimiliano, Buliga, Andrei, Di Francescomarino, Chiara, Folino, Francesco, Ghidini, Chiara, Meneghello, Francesca, Pontieri, Luigi
In recent years, trace generation has emerged as a significant challenge within the Process Mining community. Deep Learning (DL) models have demonstrated accuracy in reproducing the features of the selected processes. However, current DL generative models are limited in their ability to adapt the learned distributions to generate data samples based on specific conditions or attributes. This limitation is particularly significant because the ability to control the type of generated data can be beneficial in various contexts, enabling a focus on specific behaviours, exploration of infrequent patterns, or simulation of alternative 'what-if' scenarios. In this work, we address this challenge by introducing a conditional model for process data generation based on a conditional variational autoencoder (CVAE). Conditional models offer control over the generation process by tuning input conditional variables, enabling more targeted and controlled data generation. Unlike other domains, CVAE for process mining faces specific challenges due to the multiperspective nature of the data and the need to adhere to control-flow rules while ensuring data variability. Specifically, we focus on generating process executions conditioned on control flow and temporal features of the trace, allowing us to produce traces for specific, identified sub-processes. The generated traces are then evaluated using common metrics for generative model assessment, along with additional metrics to evaluate the quality of the conditional generation
Guiding the generation of counterfactual explanations through temporal background knowledge for Predictive Process Monitoring
Buliga, Andrei, Di Francescomarino, Chiara, Ghidini, Chiara, Donadello, Ivan, Maggi, Fabrizio Maria
Counterfactual explanations suggest what should be different in the input instance to change the outcome of an AI system. When dealing with counterfactual explanations in the field of Predictive Process Monitoring, however, control flow relationships among events have to be carefully considered. A counterfactual, indeed, should not violate control flow relationships among activities (temporal background knowledege). Within the field of Explainability in Predictive Process Monitoring, there have been a series of works regarding counterfactual explanations for outcome-based predictions. However, none of them consider the inclusion of temporal background knowledge when generating these counterfactuals. In this work, we adapt state-of-the-art techniques for counterfactual generation in the domain of XAI that are based on genetic algorithms to consider a series of temporal constraints at runtime. We assume that this temporal background knowledge is given, and we adapt the fitness function, as well as the crossover and mutation operators, to maintain the satisfaction of the constraints. The proposed methods are evaluated with respect to state-of-the-art genetic algorithms for counterfactual generation and the results are presented. We showcase that the inclusion of temporal background knowledge allows the generation of counterfactuals more conformant to the temporal background knowledge, without however losing in terms of the counterfactual traditional quality metrics.
Incremental Predictive Process Monitoring: How to Deal with the Variability of Real Environments
Di Francescomarino, Chiara, Ghidini, Chiara, Maggi, Fabrizio Maria, Rizzi, Williams, Persia, Cosimo Damiano
A characteristic of existing predictive process monitoring techniques is to first construct a predictive model based on past process executions, and then use it to predict the future of new ongoing cases, without the possibility of updating it with new cases when they complete their execution. This can make predictive process monitoring too rigid to deal with the variability of processes working in real environments that continuously evolve and/or exhibit new variant behaviors over time. As a solution to this problem, we propose the use of algorithms that allow the incremental construction of the predictive model. These incremental learning algorithms update the model whenever new cases become available so that the predictive model evolves over time to fit the current circumstances. The algorithms have been implemented using different case encoding strategies and evaluated on a number of real and synthetic datasets. The results provide a first evidence of the potential of incremental learning strategies for predicting process monitoring in real environments, and of the impact of different case encoding strategies in this setting.
Explain, Adapt and Retrain: How to improve the accuracy of a PPM classifier through different explanation styles
Rizzi, Williams, Di Francescomarino, Chiara, Ghidini, Chiara, Maggi, Fabrizio Maria
Recent papers have introduced a novel approach to explain why a Predictive Process Monitoring (PPM) model for outcome-oriented predictions provides wrong predictions. Moreover, they have shown how to exploit the explanations, obtained using state-of-the art post-hoc explainers, to identify the most common features that induce a predictor to make mistakes in a semi-automated way, and, in turn, to reduce the impact of those features and increase the accuracy of the predictive model. This work starts from the assumption that frequent control flow patterns in event logs may represent important features that characterize, and therefore explain, a certain prediction. Therefore, in this paper, we (i) employ a novel encoding able to leverage DECLARE constraints in Predictive Process Monitoring and compare the effectiveness of this encoding with Predictive Process Monitoring state-of-the art encodings, in particular for the task of outcome-oriented predictions; (ii) introduce a completely automated pipeline for the identification of the most common features inducing a predictor to make mistakes; and (iii) show the effectiveness of the proposed pipeline in increasing the accuracy of the predictive model by validating it on different real-life datasets.
Recommending the optimal policy by learning to act from temporal data
Branchi, Stefano, Buliga, Andrei, Di Francescomarino, Chiara, Ghidini, Chiara, Meneghello, Francesca, Ronzani, Massimiliano
Prescriptive Process Monitoring is a prominent problem in Process Mining, which consists in identifying a set of actions to be recommended with the goal of optimising a target measure of interest or Key Performance Indicator (KPI). One challenge that makes this problem difficult is the need to provide Prescriptive Process Monitoring techniques only based on temporally annotated (process) execution data, stored in, so-called execution logs, due to the lack of well crafted and human validated explicit models. In this paper we aim at proposing an AI based approach that learns, by means of Reinforcement Learning (RL), an optimal policy (almost) only from the observation of past executions and recommends the best activities to carry on for optimizing a KPI of interest. This is achieved first by learning a Markov Decision Process for the specific KPIs from data, and then by using RL training to learn the optimal policy. The approach is validated on real and synthetic datasets and compared with off-policy Deep RL approaches. The ability of our approach to compare with, and often overcome, Deep RL approaches provides a contribution towards the exploitation of white box RL techniques in scenarios where only temporal execution data are available.
Exploring Business Process Deviance with Sequential and Declarative Patterns
Bergami, Giacomo, Di Francescomarino, Chiara, Ghidini, Chiara, Maggi, Fabrizio Maria, Puura, Joonas
Business process deviance refers to the phenomenon whereby a subset of the executions of a business process deviate, in a negative or positive way, with respect to {their} expected or desirable outcomes. Deviant executions of a business process include those that violate compliance rules, or executions that undershoot or exceed performance targets. Deviance mining is concerned with uncovering the reasons for deviant executions by analyzing event logs stored by the systems supporting the execution of a business process. In this paper, the problem of explaining deviations in business processes is first investigated by using features based on sequential and declarative patterns, and a combination of them. Then, the explanations are further improved by leveraging the data attributes of events and traces in event logs through features based on pure data attribute values and data-aware declarative rules. The explanations characterizing the deviances are then extracted by direct and indirect methods for rule induction. Using real-life logs from multiple domains, a range of feature types and different forms of decision rules are evaluated in terms of their ability to accurately discriminate between non-deviant and deviant executions of a process as well as in terms of understandability of the final outcome returned to the users.
Process Extraction from Text: state of the art and challenges for the future
Bellan, Patrizio, Dragoni, Mauro, Ghidini, Chiara
Automatic Process Discovery aims at developing algorithmic methodologies for the extraction and elicitation of process models as described in data. While Process Discovery from event-log data is a well established area, that has already moved from research to concrete adoption in a mature manner, Process Discovery from text is still a research area at an early stage of development, which rarely scales to real world documents. In this paper we analyze, in a comparative manner, reference state-of-the-art literature, especially for what concerns the techniques used, the process elements extracted and the evaluations performed. As a result of the analysis we discuss important limitations that hamper the exploitation of recent Natural Language Processing techniques in this field and we discuss fundamental limitations and challenges for the future concerning the datasets, the techniques, the experimental evaluations, and the pipelines currently adopted and to be developed in the future.
How do I update my model? On the resilience of Predictive Process Monitoring models to change
Rizzi1, Williams, Di Francescomarino, Chiara, Ghidini, Chiara, Maggi, Fabrizio Maria
Existing well investigated Predictive Process Monitoring techniques typically construct a predictive model based on past process executions, and then use it to predict the future of new ongoing cases, without the possibility of updating it with new cases when they complete their execution. This can make Predictive Process Monitoring too rigid to deal with the variability of processes working in real environments that continuously evolve and/or exhibit new variant behaviours over time. As a solution to this problem, we evaluate the use of three different strategies that allow the periodic rediscovery or incremental construction of the predictive model so as to exploit new available data. The evaluation focuses on the performance of the new learned predictive models, in terms of accuracy and time, against the original one, and uses a number of real and synthetic datasets with and without explicit Concept Drift. The results provide an evidence of the potential of incremental learning algorithms for predicting process monitoring in real environments.
Verification of data-aware workflows via reachability: formalisation and experiments
De Masellis, Riccardo, Di Francescomarino, Chiara, Ghidini, Chiara, Tessaris, Sergio
The growing adoption of IT-systems for the modelling and execution of (business) processes or services has thrust the scientific investigation towards techniques and tools which support complex forms of process analysis. These techniques rely on observation of past (tracked and logged) process executions but typically: (i) only consider activities, lacking the ability to take into account the data objects manipulated by these activities and (ii) assume complete observations of terminated process executions. In many real cases, however, only incomplete log information is available. This paper tackles these two shortcomings by proposing an approach to exploit reachability to reason on imperative data-aware process models and possibly incomplete process executions. The contribution of this paper is twofold: first, it formulates the trace completion as a reachability problem over data-aware models and second, it provides a rigorous mapping between our data-aware models and three important paradigms for reasoning about dynamic systems, namely Action Languages, Classical Planning, and Model-Checking. This allows us to exploit and extensively evaluate the available tools for the above paradigms to solve the trace repair problem. The rigorous encoding of our data-aware models, based on a common interpretation of the semantics of Action Languages, Classical Planning, and Model-Checking in terms of transition systems, paired with a first comprehensive assessment of the performances of their tools in computing reachability for data-aware workflow net languages, provide a solid contribution to advancing the state-of-the-art on the concrete exploitation of formal verification techniques on business processes.