Not enough data to create a plot.
Try a different view from the menu above.
Patrizi, Fabio
Exploiting Multiple Abstractions in Episodic RL via Reward Shaping
Cipollone, Roberto, De Giacomo, Giuseppe, Favorito, Marco, Iocchi, Luca, Patrizi, Fabio
One major limitation to the applicability of Reinforcement Learning (RL) to many practical domains is the large number of samples required to learn an optimal policy. To address this problem and improve learning efficiency, we consider a linear hierarchy of abstraction layers of the Markov Decision Process (MDP) underlying the target domain. Each layer is an MDP representing a coarser model of the one immediately below in the hierarchy. In this work, we propose a novel form of Reward Shaping where the solution obtained at the abstract level is used to offer rewards to the more concrete MDP, in such a way that the abstract solution guides the learning in the more complex domain. In contrast with other works in Hierarchical RL, our technique has few requirements in the design of the abstract models and it is also tolerant to modeling errors, thus making the proposed approach practical. We formally analyze the relationship between the abstract models and the exploration heuristic induced in the lower-level domain. Moreover, we prove that the method guarantees optimal convergence and we demonstrate its effectiveness experimentally.
Optimal Alignment of Temporal Knowledge Bases
Fernandez-Gil, Oliver, Patrizi, Fabio, Perelli, Giuseppe, Turhan, Anni-Yasmin
Answering temporal CQs over temporalized Description Logic knowledge bases (TKB) is a main technique to realize ontology-based situation recognition. In case the collected data in such a knowledge base is inaccurate, important query answers can be missed. In this paper we introduce the TKB Alignment problem, which computes a variant of the TKB that minimally changes the TKB, but entails the given temporal CQ and is in that sense (cost-)optimal. We investigate this problem for ALC TKBs and conjunctive queries with LTL operators and devise a solution technique to compute (cost-optimal) alignments of TKBs that extends techniques for the alignment problem for propositional LTL over finite traces.
Mimicking Behaviors in Separated Domains
De Giacomo, Giuseppe (Sapienza University of Rome) | Fried, Dror (The Open University of Israel) | Patrizi, Fabio (Sapienza University of Rome) | Zhu, Shufang (a:1:{s:5:"en_US";s:27:"Sapienza Univeristy of Rome";})
Devising a strategy to make a system mimic behaviors from another system is a problem that naturally arises in many areas of Computer Science. In this work, we interpret this problem in the context of intelligent agents, from the perspective of ltlf , a formalism commonly used in AI for expressing finite-trace properties. Our model consists of two separated dynamic domains, DA and DB, and an LTLf specification that formalizes the notion of mimicking by mapping properties on behaviors (traces) of DA into properties on behaviors of DB. The goal is to synthesize a strategy that step-by-step maps every behavior of DA into a behavior of DB so that the specification is met. We consider several forms of mapping specifications, ranging from simple ones to full LTLf , and for each, we study synthesis algorithms and computational properties.
Monitoring Hybrid Process Specifications with Conflict Management: The Automata-theoretic Approach
Alman, Anti, Maggi, Fabrizio Maria, Montali, Marco, Patrizi, Fabio, Rivkin, Andrey
Business process monitoring approaches have thus far mainly focused on monitoring the execution of a process with respect to a single process model. However, in some cases it is necessary to consider multiple process specifications simultaneously. In addition, these specifications can be procedural, declarative, or a combination of both. For example, in the medical domain, a clinical guideline describing the treatment of a specific disease cannot account for all possible co-factors that can coexist for a specific patient and therefore additional constraints may need to be considered. In some cases, these constraints may be incompatible with clinical guidelines, therefore requiring the violation of either the guidelines or the constraints. In this paper, we propose a solution for monitoring the interplay of hybrid process specifications expressed as a combination of (data-aware) Petri nets and temporal logic rules. During the process execution, if these specifications are in conflict with each other, it is possible to violate some of them. The monitoring system is equipped with a violation cost model according to which the system can recommend the next course of actions in a way that would either avoid possible violations or minimize the total cost of violations.
Reinforcement Learning for LTLf/LDLf Goals
De Giacomo, Giuseppe, Iocchi, Luca, Favorito, Marco, Patrizi, Fabio
MDPs extended with LTLf/LDLf non-Markovian rewards have recently attracted interest as a way to specify rewards declaratively. In this paper, we discuss how a reinforcement learning agent can learn policies fulfilling LTLf/LDLf goals. In particular we focus on the case where we have two separate representations of the world: one for the agent, using the (predefined, possibly low-level) features available to it, and one for the goal, expressed in terms of high-level (human-understandable) fluents. We formally define the problem and show how it can be solved. Moreover, we provide experimental evidence that keeping the RL agent feature space separated from the goal's can work in practice, showing interesting cases where the agent can indeed learn a policy that fulfills the LTLf/LDLf goal using only its features (augmented with additional memory).
Situation Calculus for Synthesis of Manufacturing Controllers
De Giacomo, Giuseppe, Logan, Brian, Felli, Paolo, Patrizi, Fabio, Sardina, Sebastian
Manufacturing is transitioning from a mass production model to a manufacturing as a service model in which manufacturing facilities 'bid' to produce products. To decide whether to bid for a complex, previously unseen product, a manufacturing facility must be able to synthesize, 'on the fly', a process plan controller that delegates abstract manufacturing tasks in the supplied process recipe to the appropriate manufacturing resources, e.g., CNC machines, robots etc. Previous work in applying AI behaviour composition to synthesize process plan controllers has considered only finite state ad-hoc representations. Here, we study the problem in the relational setting of the Situation Calculus. By taking advantage of recent work on abstraction in the Situation Calculus, process recipes and available resources are represented by ConGolog programs over, respectively, an abstract and a concrete action theory. This allows us to capture the problem in a formal, general framework, and show decidability for the case of bounded action theories. We also provide techniques for actually synthesizing the controller.
LTLf/LDLf Non-Markovian Rewards
Brafman, Ronen I. (Ben-Gurion University) | Giacomo, Giuseppe De (Sapienza University of Rome) | Patrizi, Fabio (Sapienza University of Rome)
In Markov Decision Processes (MDPs), the reward obtained in a state is Markovian, i.e., depends on the last state and action. This dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request. Extending MDPs to handle non-Markovian reward functions was the subject of two previous lines of work. Both use LTL variants to specify the reward function and then compile the new model back into a Markovian model. Building on recent progress in temporal logics over finite traces, we adopt LDLf for specifying non-Markovian rewards and provide an elegant automata construction for building a Markovian model, which extends that of previous work and offers strong minimality and compositionality guarantees.
On the Disruptive Effectiveness of Automated Planning for LTL f -Based Trace Alignment
Giacomo, Giuseppe De (Sapienza - Università di Roma) | Maggi, Fabrizio Maria (University of Tartu) | Marrella, Andrea (Sapienza - Università di Roma) | Patrizi, Fabio (Sapienza - Università di Roma)
One major task in business process management is that of aligning real process execution traces to a process model by (minimally) introducing and eliminating steps. Here, we look at declarative process specifications expressed in Linear Temporal Logic on finite traces (LTLf). We provide a sound and complete technique to synthesize the alignment instructions relying on finite automata theoretic manipulations. Such a technique can be effectively implemented by using planning technology. Notably, the resulting planning-based alignment system significantly outperforms all current state-of-the-art ad-hoc alignment systems. We report an in-depth experimental study that supports this claim.
Verifying ConGolog Programs on Bounded Situation Calculus Theories
Giacomo, Giuseppe De (Sapienza University of Rome) | Lespérance, Yves (York University) | Patrizi, Fabio (Free University of Bozen-Bolzano) | Sardina, Sebastian (RMIT University)
We address verification of high-level programs over situation calculus action theories that have an infinite object domain, but bounded fluent extensions in each situation. We show that verification of mu-calculus temporal properties against ConGolog programs over such bounded theories is decidable in general. To do this, we reformulate the transition semantics of ConGolog to keep the bindings of “pick variables” into a separate variable environment whose size is naturally bounded by the number of variables. We also show that for situation-determined ConGolog programs, we can compile away the program into the action theory itself without loss of generality. This can also be done for arbitrary programs, but only to check certain properties, such as if a situation is the result of a program execution, not for mu-calculus verification.
Bounded Epistemic Situation Calculus Theories
Giacomo, Giuseppe De (Università di Roma "La Sapienza") | Lespérance, Yves (York University, Toronto) | Patrizi, Fabio (Università di Roma "La Sapienza")
We define the class of e-bounded theories in the epistemic situation calculus, where the number of fluent atoms that the agent thinks may be true is bounded by a constant. Such theories can still have an infinite domain and an infinite set of states. We show that for them verification of an expressive class of first-order mu-calculus temporal epistemic properties is decidable. We also show that if the agent’s knowledge in the initial situation is e-bounded and the objective part of an action theory maintains boundedness, then the entire epistemic theory is e-bounded.