Meneguzzi, Felipe
CP-MDP: A CANDECOMP-PARAFAC Decomposition Approach to Solve a Markov Decision Process Multidimensional Problem
Kuinchtner, Daniela, Sales, Afonso, Meneguzzi, Felipe
Markov Decision Process (MDP) is the underlying model for optimal planning for decision-theoretic agents in stochastic environments. Although much research focuses on solving MDP problems both in tabular form or using factored representations, none focused on tensor decomposition methods. Solving MDPs using tensor algebra offers the prospect of leveraging advances in tensor-based computations to further increase solver efficiency. In this paper, we develop an MDP solver for a multidimensional problem using a tensor decomposition method to compress the transition models and optimize the value iteration and policy iteration algorithms. We empirically evaluate our approach against tabular methods and show our approach can compute much larger problems using substantially less memory, opening up new possibilities for tensor-based approaches in stochastic planning
Inferring Agents Preferences as Priors for Probabilistic Goal Recognition
Gusmรฃo, Kin Max, Pereira, Ramon Fraga, Meneguzzi, Felipe
Recent approaches to goal recognition have leveraged planning landmarks to achieve high-accuracy with low runtime cost. These approaches, however, lack a probabilistic interpretation. Furthermore, while most probabilistic models to goal recognition assume that the recognizer has access to a prior probability representing, for example, an agent's preferences, virtually no goal recognition approach actually uses the prior in practice, simply assuming a uniform prior. In this paper, we provide a model to both extend landmark-based goal recognition with a probabilistic interpretation and allow the estimation of such prior probability and its usage to compute posterior probabilities after repeated interactions of observed agents. We empirically show that our model can not only recognize goals effectively but also successfully infer the correct prior probability distribution representing an agent's preferences.
Norm Identification through Plan Recognition
Oren, Nir, Meneguzzi, Felipe
Societal rules, as exemplified by norms, aim to provide a degree of behavioural stability to multi-agent societies. Norms regulate a society using the deontic concepts of permissions, obligations and prohibitions to specify what can, must and must not occur in a society. Many implementations of normative systems assume various combinations of the following assumptions: that the set of norms is static and defined at design time; that agents joining a society are instantly informed of the complete set of norms; that the set of agents within a society does not change; and that all agents are aware of the existing norms. When any one of these assumptions is dropped, agents need a mechanism to identify the set of norms currently present within a society, or risk unwittingly violating the norms. In this paper, we develop a norm identification mechanism that uses a combination of parsing-based plan recognition and Hierarchical Task Network (HTN) planning mechanisms, which operates by analysing the actions performed by other agents. While our basic mechanism cannot learn in situations where norm violations take place, we describe an extension which is able to operate in the presence of violations.
Imitating Unknown Policies via Exploration
Gavenski, Nathan, Monteiro, Juarez, Granada, Roger, Meneguzzi, Felipe, Barros, Rodrigo C.
Behavioral cloning is an imitation learning technique that teaches an agent how to behave through expert demonstrations. Recent approaches use self-supervision of fully-observable unlabeled snapshots of the states to decode state-pairs into actions. However, the iterative learning scheme from these techniques are prone to getting stuck into bad local minima. We address these limitations incorporating a two-phase model into the original framework, which learns from unlabeled observations via exploration, substantially improving traditional behavioral cloning by exploiting (i) a sampling mechanism to prevent bad local minima, (ii) a sampling mechanism to improve exploration, and (iii) self-attention modules to capture global features. The resulting technique outperforms the previous state-of-the-art in four different environments by a large margin.
Automated Database Indexing using Model-free Reinforcement Learning
Licks, Gabriel Paludo, Meneguzzi, Felipe
Configuring databases for efficient querying is a complex task, often carried out by a database administrator. Solving the problem of building indexes that truly optimize database access requires a substantial amount of database and domain knowledge, the lack of which often results in wasted space and memory for irrelevant indexes, possibly jeopardizing database performance for querying and certainly degrading performance for updating. We develop an architecture to solve the problem of automatically indexing a database by using reinforcement learning to optimize queries by indexing data throughout the lifetime of a database. In our experimental evaluation, our architecture shows superior performance compared to related work on reinforcement learning and genetic algorithms, maintaining near-optimal index configurations and efficiently scaling to large databases.
Energy-Aware Path Planning for Autonomous Mobile Robot Navigation
Maidana, Renan (Pontifical Catholic University of Rio Grande do Sul ) | Granada, Roger (Pontifical Catholic University of Rio Grande do Sul) | Jurak, Darlan (Pontifical Catholic University of Rio Grande do Sul) | Magnaguagno, Maurรญcio (Pontifical Catholic University of Rio Grande do Sul) | Meneguzzi, Felipe (Pontifical Catholic University of Rio Grande do Sul) | Amory, Alexandre (Pontifical Catholic University of Rio Grande do Sul)
Battery life is yet one of the main limiting factors to a robot's total mission time, and efficient energy management is paramount in a robotic application. In this paper, we integrate energy awareness in the path planning of a mobile robot performing autonomous navigation. Our contributions are: 1) The formalization of a planning domain for mobile robot path planning which accounts for energy consumption and integrates energy actions in the generated plans; 2) A proof of concept of automatic path planning that avoids high energy areas in a known environment. We test our approach in simulation, extending an embedded computer's total battery discharge time by approximately 42.8%, and in a real ground mobile robot, achieving a mean energy draw reduction of 52.02%, both compared to conventional path planning.
The More the Merrier?! Evaluating the Effect of Landmark Extraction Algorithms on Landmark-Based Goal Recognition
Gusmรฃo, Kin Max Piamolini, Pereira, Ramon Fraga, Meneguzzi, Felipe
Recent approaches to goal and plan recognition using classical planning domains have achieved state of the art results in terms of both recognition time and accuracy by using heuristics based on planning landmarks. To achieve such fast recognition time these approaches use efficient, but incomplete, algorithms to extract only a subset of landmarks for planning domains and problems, at the cost of some accuracy. In this paper, we investigate the impact and effect of using various landmark extraction algorithms capable of extracting a larger proportion of the landmarks for each given planning problem, up to exhaustive landmark extraction. We perform an extensive empirical evaluation of various landmark-based heuristics when using different percentages of the full set of landmarks. Results show that having more landmarks does not necessarily mean achieving higher accuracy and lower spread, as the additional extracted landmarks may not necessarily increase be helpful towards the goal recognition task.
Classifying Norm Conflicts using Learned Semantic Representations
Aires, Joรฃo Paulo, Granada, Roger, Monteiro, Juarez, Barros, Rodrigo C., Meneguzzi, Felipe
As natural language uses a diverse and often vague way to express ideas, identifying a norm conflict and its causes While most social norms are informal, they are often in contracts is a challenging task. An ever larger number of formalized by companies in contracts to regulate contracts being currently generated necessitates a fast and reliable trades of goods and services. When poorly process to identify norm conflicts. However, since such written, contracts may contain normative conflicts contracts are written in natural language, traditional revision resulting from opposing deontic meanings or contradict methods involve contract makers reading the contract and specifications. As contracts tend to be identifying conflicting points between norms. Such a method long and contain many norms, manually identifying requires huge human-effort and may not guarantee a revision such conflicts requires human-effort, which is that eliminates all conflicts. In response, we provide three time-consuming and error-prone. Automating such contributions towards automatically identifying and classifying task benefits contract makers increasing productivity potential conflicts between norms in contracts.
Robust Goal Recognition with Operator-Counting Heuristics
Meneguzzi, Felipe, Pereira, Andrรฉ Grahl, Pereira, Ramon Fraga
Goal recognition is the problem of inferring the correct Operator-counting heuristics provide a unifying framework goal towards which an agent executes a plan, for a variety of sources of information from planning heuristics given a set of goal hypotheses, a domain model, [Hoffmann that provide both an estimate ofet al., 2004] and a (possibly noisy) sample of the plan being the total cost of a goal from any given state and and indication executed. This is a key problem in both cooperative of the actual operators likely to be in such plans. This and competitive agent interactions and recent information proves to be effective at differentiating between approaches have produced fast and accurate goal goal hypotheses in goal recognition, as we empirically show recognition algorithms.
Using Sub-Optimal Plan Detection to Identify Commitment Abandonment in Discrete Environments
Pereira, Ramon Fraga, Oren, Nir, Meneguzzi, Felipe
Assessing whether an agent has abandoned a goal or is actively pursuing it is important when multiple agents are trying to achieve joint goals, or when agents commit to achieving goals for each other. Making such a determination for a single goal by observing only plan traces is not trivial as agents often deviate from optimal plans for various reasons, including the pursuit of multiple goals or the inability to act optimally. In this article, we develop an approach based on domain independent heuristics from automated planning, landmarks, and fact partitions to identify sub-optimal action steps - with respect to a plan - within a plan execution trace. Such capability is very important in domains where multiple agents cooperate and delegate tasks among themselves, e.g. through social commitments, and need to ensure that a delegating agent can infer whether or not another agent is actually progressing towards a delegated task. We demonstrate how an agent can use our technique to determine - by observing a trace - whether an agent is honouring a commitment. We empirically show, for a number of representative domains, that our approach infers sub-optimal action steps with very high accuracy and detects commitment abandonment in nearly all cases.