 conformance checking


Run-Time Monitoring of ERTMS/ETCS Control Flow by Process Mining

Vitale, Francesco, Zoppi, Tommaso, Flammini, Francesco, Mazzocca, Nicola

arXiv.org Artificial Intelligence

Ensuring the resilience of computer-based railways is increasingly crucial to account for uncertainties and changes due to the growing complexity and criticality of these systems. Although their software relies on strict verification and validation processes following well-established best practices and certification standards, anomalies can still occur at run-time due to residual faults, system and environmental modifications that were unknown at design time, or other emergent cyber-threat scenarios. This paper explores run-time control-flow anomaly detection using process mining to enhance the resilience of ERTMS/ETCS L2 (European Rail Traffic Management System / European Train Control System Level 2). Process mining allows learning the actual control flow of the system from its execution traces, thus enabling run-time monitoring through online conformance checking. In addition, anomaly localization is performed through unsupervised machine learning to link relevant deviations to critical system components. We test our approach on a reference ERTMS/ETCS L2 scenario, namely the RBC/RBC Handover, to show its capability to detect and localize anomalies with high accuracy, efficiency, and explainability.
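To make the idea of learning control flow from traces and then checking events online more concrete, here is a minimal sketch. It is not the method of the paper: it assumes a much simpler model (a set of directly-follows pairs) in place of a discovered process model, and the activity names and the `OnlineMonitor` class are hypothetical.

```python
def learn_directly_follows(traces):
    """Collect every (previous, next) activity pair seen in normal traces."""
    allowed = set()
    for trace in traces:
        prev = "<start>"
        for act in trace:
            allowed.add((prev, act))
            prev = act
        allowed.add((prev, "<end>"))
    return allowed


class OnlineMonitor:
    """Checks one event at a time against the learned pairs (assumed model)."""

    def __init__(self, allowed):
        self.allowed = allowed
        self.prev = "<start>"
        self.deviations = []

    def observe(self, act):
        # A transition never seen during learning is recorded as a deviation.
        if (self.prev, act) not in self.allowed:
            self.deviations.append((self.prev, act))
        self.prev = act


# Hypothetical activity names, for illustration only.
allowed = learn_directly_follows([["request", "ack", "done"]])
monitor = OnlineMonitor(allowed)
for event in ["request", "done"]:
    monitor.observe(event)
```

The recorded deviation pairs name the activities involved, which is the kind of information a localization step could build on.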


Architecting software monitors for control-flow anomaly detection through large language models and conformance checking

Vitale, Francesco, Flammini, Francesco, Caporuscio, Mauro, Mazzocca, Nicola

arXiv.org Artificial Intelligence

Context: Ensuring high levels of dependability in modern computer-based systems has become increasingly challenging due to their complexity. Although systems are validated at design time, their behavior can be different at run-time, possibly showing control-flow anomalies due to "unknown unknowns". Objective: We aim to detect control-flow anomalies through software monitoring, which verifies run-time behavior by logging software execution and detecting deviations from expected control flow. Methods: We propose a methodology to develop software monitors for control-flow anomaly detection through Large Language Models (LLMs) and conformance checking. The methodology builds on existing software development practices to maintain traditional V&V while providing an additional level of robustness and trustworthiness. It leverages LLMs to link design-time models and implementation code, automating source-code instrumentation. The resulting event logs are analyzed via conformance checking, an explainable and effective technique for control-flow anomaly detection. Results: We test the methodology on a case-study scenario from the European Rail Traffic Management System / European Train Control System (ERTMS/ETCS), which is a railway standard for modern interoperable railways. The results obtained from the ERTMS/ETCS case study demonstrate that LLM-based source-code instrumentation can achieve up to 84.775% control-flow coverage of the reference design-time process model, while the subsequent conformance checking-based anomaly detection reaches a peak performance of 96.610% F1-score and 93.515% AUC. Conclusion: Incorporating domain-specific knowledge to guide LLMs in source-code instrumentation yielded reliable, high-quality software logs and enabled effective control-flow anomaly detection through conformance checking.


To bind or not to bind? Discovering Stable Relationships in Object-centric Processes (Extended Version)

Seidel, Anjo, Winkler, Sarah, Gianola, Alessandro, Montali, Marco, Weske, Mathias

arXiv.org Artificial Intelligence

Object-centric process mining investigates the intertwined behavior of multiple objects in business processes. From object-centric event logs, object-centric Petri nets (OCPN) can be discovered to replay the behavior of processes accessing different object types. Although they indicate how objects flow through the process and co-occur in events, OCPNs remain underspecified about the relationships of objects. Hence, they are not able to represent synchronization, i.e., executing objects only according to their intended relationships, and fail to identify violating executions. Existing formal modeling approaches, such as object-centric Petri nets with identifiers (OPID), represent object identities and relationships to synchronize them correctly. However, OPID discovery has not yet been studied. This paper uses explicit data models to bridge the gap between OCPNs and formal OPIDs. We identify the implicit assumption of stable many-to-one relationships in object-centric event logs, which implies synchronization of related objects. To formally underpin this observation, we combine OCPNs with explicit stable many-to-one relationships in a rigorous mapping from OCPNs to OPIDs explicitly capturing the intended stable relationships and the synchronization of related objects. We prove that the original OCPNs and the resulting OPIDs coincide for those executions that satisfy the intended relationships. Moreover, we provide an implementation of the mapping from OCPN to OPID under stable relationships.


Technical Report with Proofs for A Full Picture in Conformance Checking: Efficiently Summarizing All Optimal Alignments

Bär, Philipp, Wynn, Moe T., Leemans, Sander J. J.

arXiv.org Artificial Intelligence

Repeated application of the reduction rules to δ is terminating. None of (R1-R3) increases the size of this set again. We prove local confluency for every pair of rules where the left sides overlap. We only inspect moves where there can be overlapping rules, i.e., (R2,R3) and (R2,R2). Canonicity follows from both propositions together with Newman's Lemma [1].


Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences

Bogdanov, Eli, Cohen, Izack, Gal, Avigdor

arXiv.org Artificial Intelligence

Long event sequences (termed traces) and large data logs that originate from sensors and prediction models are becoming increasingly common in our data-rich world. In such scenarios, conformance checking, i.e., validating a data log against an expected system behavior (the process model), can become computationally infeasible due to the exponential complexity of finding an optimal alignment. To alleviate scalability challenges for this task, we propose ConLES, a sliding-window conformance checking approach for long event sequences that preserves the interpretability of alignment-based methods. ConLES partitions traces into manageable subtraces and iteratively aligns each against the expected behavior, leading to significant reduction of the search space while maintaining overall accuracy. We use global information that captures structural properties of both the trace and the process model, enabling informed alignment decisions and discarding unpromising alignments, even if they appear locally optimal. Performance evaluations across multiple datasets highlight that ConLES outperforms the leading optimal and heuristic algorithms for long traces, consistently achieving the optimal or near-optimal solution. Unlike other conformance methods that struggle with long event sequences, ConLES significantly reduces the search space, scales efficiently, and uniquely supports both predefined and discovered process models, making it a viable and leading option for conformance checking of long event sequences.


Object-centric Processes with Structured Data and Exact Synchronization (Extended Version)

Gianola, Alessandro, Montali, Marco, Winkler, Sarah

arXiv.org Artificial Intelligence

Real-world processes often involve interdependent objects that also carry data values, such as integers, reals, or strings. However, existing process formalisms fall short of combining key modeling features, such as tracking object identities, supporting complex datatypes, handling dependencies among them, and object-aware synchronization. Object-centric Petri nets with identifiers (OPIDs) partially address these needs but treat objects as unstructured identifiers (e.g., order and item IDs), overlooking the rich semantics of complex data values (e.g., item prices or other attributes). To overcome these limitations, we introduce data-aware OPIDs (DOPIDs), a framework that strictly extends OPIDs by incorporating structured data manipulation capabilities and full synchronization mechanisms. In spite of the expressiveness of the model, we show that it can be made operational: specifically, we define a novel conformance checking approach leveraging satisfiability modulo theories (SMT) to compute data-aware object-centric alignments.


DeclareAligner: A Leap Towards Efficient Optimal Alignments for Declarative Process Model Conformance Checking

Casas-Ramos, Jacobo, Lama, Manuel, Mucientes, Manuel

arXiv.org Artificial Intelligence

In many engineering applications, processes must be followed precisely, making conformance checking between event logs and declarative process models crucial for ensuring adherence to desired behaviors. This is a critical area where Artificial Intelligence (AI) plays a pivotal role in driving effective process improvement. However, computing optimal alignments poses significant computational challenges due to the vast search space inherent in these models. Consequently, existing approaches often struggle with scalability and efficiency, limiting their applicability in real-world settings. This paper introduces DeclareAligner, a novel algorithm that uses the A* search algorithm, an established AI pathfinding technique, to tackle the problem from a fresh perspective leveraging the flexibility of declarative models. Key features of DeclareAligner include only performing actions that actively contribute to fixing constraint violations, utilizing a tailored heuristic to navigate towards optimal solutions, and employing early pruning to eliminate unproductive branches, while also streamlining the process through preprocessing and consolidating multiple fixes into unified actions. The proposed method is evaluated using 8,054 synthetic and real-life alignment problems, demonstrating its ability to efficiently compute optimal alignments by significantly outperforming the current state of the art. By enabling process analysts to more effectively identify and understand conformance issues, DeclareAligner has the potential to drive meaningful process improvement and management.


Control-flow anomaly detection by process mining-based feature extraction and dimensionality reduction

Vitale, Francesco, Pegoraro, Marco, van der Aalst, Wil M. P., Mazzocca, Nicola

arXiv.org Artificial Intelligence

The business processes of organizations may deviate from normal control flow due to disruptive anomalies, including unknown, skipped, and wrongly-ordered activities. To identify these control-flow anomalies, process mining can check control-flow correctness against a reference process model through conformance checking, an explainable set of algorithms that allows linking any deviations with model elements. However, the effectiveness of conformance checking-based techniques is negatively affected by noisy event data and low-quality process models. To address these shortcomings and support the development of competitive and explainable conformance checking-based techniques for control-flow anomaly detection, we propose a novel process mining-based feature extraction approach with alignment-based conformance checking. This variant aligns the deviating control flow with a reference process model; the resulting alignment can be inspected to extract additional statistics such as the number of times a given activity caused mismatches. We integrate this approach into a flexible and explainable framework for developing techniques for control-flow anomaly detection. The framework combines process mining-based feature extraction and dimensionality reduction to handle high-dimensional feature sets, achieve detection effectiveness, and support explainability. The results show that the framework techniques implementing our approach outperform the baseline conformance checking-based techniques while maintaining the explainable nature of conformance checking. We also provide an explanation of why existing conformance checking-based techniques may be ineffective.
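The alignment-based feature extraction described above can be illustrated with a small sketch. It makes a strong simplifying assumption that is not in the abstract: the reference model is reduced to one allowed activity sequence (`model_run`), whereas real alignments search over all runs of a process model. The function names are hypothetical.

```python
from collections import Counter


def align(trace, model_run):
    """Optimal alignment of a trace against ONE model run (simplification).

    Returns moves tagged 'sync', 'log' (only in trace) or 'model' (only in model).
    """
    n, m = len(trace), len(model_run)
    # cost[i][j]: fewest non-synchronous moves to align trace[i:] with model_run[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n:
                options.append(1 + cost[i + 1][j])      # log move
            if j < m:
                options.append(1 + cost[i][j + 1])      # model move
            if i < n and j < m and trace[i] == model_run[j]:
                options.append(cost[i + 1][j + 1])      # synchronous move
            cost[i][j] = min(options)
    # Walk forward through the table to recover one optimal alignment.
    moves, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and trace[i] == model_run[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            moves.append(("sync", trace[i]))
            i, j = i + 1, j + 1
        elif i < n and cost[i][j] == 1 + cost[i + 1][j]:
            moves.append(("log", trace[i]))
            i += 1
        else:
            moves.append(("model", model_run[j]))
            j += 1
    return moves


def mismatch_features(moves):
    """Count, per activity, how often it appeared in a non-synchronous move."""
    return Counter(act for kind, act in moves if kind != "sync")


moves = align(["a", "c", "b", "d"], ["a", "b", "c", "d"])
features = mismatch_features(moves)
```

Per-activity mismatch counts like `features` are one example of the additional statistics that can be read off an alignment and fed to a downstream detector.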


Direct Encoding of Declare Constraints in ASP

Chiariello, Francesco, Fionda, Valeria, Ielo, Antonio, Ricca, Francesco

arXiv.org Artificial Intelligence

Answer Set Programming (ASP), a well-known declarative logic programming paradigm, has recently found practical application in Process Mining. In particular, ASP has been used to model tasks involving declarative specifications of business processes. In this area, Declare stands out as the most widely adopted declarative process modeling language, offering a means to model processes through sets of constraints that valid traces must satisfy and that can be expressed in Linear Temporal Logic over Finite Traces (LTLf). Existing ASP-based solutions encode Declare constraints by modeling the corresponding LTLf formula or its equivalent automaton, which can be obtained using established techniques. In this paper, we introduce a novel encoding for Declare constraints that directly models their semantics as ASP rules, eliminating the need for intermediate representations. We assess the effectiveness of this novel approach on two Process Mining tasks by comparing it with alternative ASP encodings and a Python library for Declare. Under consideration in Theory and Practice of Logic Programming (TPLP).
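For readers unfamiliar with Declare, the sketch below checks two common constraint templates, Response and Precedence, directly on a finite trace. It is plain Python written only to illustrate the semantics, not the ASP encoding proposed in the paper, and it assumes the two activities in each constraint are distinct.

```python
def response(trace, a, b):
    """Response(a, b): every occurrence of a is eventually followed by b."""
    pending = False  # True while some a is still waiting for a later b
    for act in trace:
        if act == a:
            pending = True
        elif act == b:
            pending = False
    return not pending


def precedence(trace, a, b):
    """Precedence(a, b): b may occur only if a has occurred earlier."""
    seen_a = False
    for act in trace:
        if act == a:
            seen_a = True
        elif act == b and not seen_a:
            return False
    return True


def conforms(trace, constraints):
    """A trace conforms if it satisfies every (checker, a, b) constraint."""
    return all(check(trace, a, b) for check, a, b in constraints)
```

For example, `conforms(["a", "c", "b"], [(response, "a", "b"), (precedence, "a", "b")])` holds, while a trace that ends with an unanswered `a` violates Response.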


Skill Learning Using Process Mining for Large Language Model Plan Generation

Redis, Andrei Cosmin, Sani, Mohammadreza Fani, Zarrin, Bahram, Burattin, Andrea

arXiv.org Artificial Intelligence

Large language models (LLMs) hold promise for generating plans for complex tasks, but their effectiveness is limited by sequential execution, lack of control flow models, and difficulties in skill retrieval. Addressing these issues is crucial for improving the efficiency and interpretability of plan generation as LLMs become more central to automation and decision-making. We introduce a novel approach to skill learning in LLMs by integrating process mining techniques, leveraging process discovery for skill acquisition, process models for skill storage, and conformance checking for skill retrieval. Our methods enhance text-based plan generation by enabling flexible skill discovery, parallel execution, and improved interpretability. Experimental results suggest the effectiveness of our approach, with our skill retrieval method surpassing state-of-the-art accuracy baselines under specific conditions.