Planning & Scheduling
Holmgård
Is it possible to conduct player modeling without any players? In this paper we use Monte-Carlo Tree Search-controlled procedural personas to simulate a range of decision making styles in the puzzle game MiniDungeons 2. The purpose is to provide a method for synthetic play testing of game levels with synthetic players based on designer intuition and experience. Five personas are constructed, representing five different decision making styles archetypal for the game. The personas vary solely in the weights of decision-making utilities that describe their valuation of a set of affordances in MiniDungeons 2. By configuring these weights using designer expert knowledge, and passing the configurations directly to the MCTS algorithm, we make the personas exhibit a number of distinct decision making and play styles.
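The idea that personas differ only in their utility weights can be illustrated with a minimal sketch. The affordance names, persona names, and weight values below are hypothetical and are not taken from the paper; the sketch only shows how one playout can be scored differently by two weight configurations, which is the quantity an MCTS controller would back up.

```python
# Hypothetical persona weight tables; the paper defines five personas,
# but their names and weights are not given in this abstract.
PERSONAS = {
    "monster_killer": {"monster_killed": 1.0, "exit_reached": 0.5, "death": -1.0},
    "treasure_collector": {"treasure_collected": 1.0, "exit_reached": 0.5, "death": -1.0},
}


def persona_utility(weights, events):
    """Return the weighted sum of affordance counts observed in one playout.

    Affordances missing from the weight table contribute nothing.
    """
    return sum(weights.get(name, 0.0) * count for name, count in events.items())


# One simulated playout, summarized as counts of (hypothetical) affordance events.
playout = {"monster_killed": 2, "treasure_collected": 1, "exit_reached": 1, "death": 0}

# The same playout receives a different reward under each persona.
scores = {name: persona_utility(w, playout) for name, w in PERSONAS.items()}
```

Under these assumed weights the same playout is worth more to the monster-oriented persona than to the treasure-oriented one, so a search guided by each reward would favor different lines of play.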
Cardona-Rivera
Interactive narratives suffer from the narrative paradox: the tension that exists between providing a coherent narrative experience and allowing a player free rein over what she can manipulate in the environment. Knowing what actions a player in such an environment intends to carry out would help in managing the narrative paradox, since it would allow us to anticipate potential threats to the intended narrative experience and potentially mediate or eliminate them. The process of observing player actions and attempting to come up with an explanation for those actions (i.e. the plan that the player is trying to carry out) is the problem of plan recognition. We adopt the framing of narratives as plans and leverage recent advances that cast plan recognition as planning to develop a symbolic plan recognition system as a proof-of-concept model of a player's reasoning in an interactive narrative environment. In this paper we outline the system architecture, report on performance metrics that demonstrate adequate performance for non-trivial domains, and discuss the implications of treating players as plan recognizers.
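As a rough illustration of what casting plan recognition as planning can look like, the sketch below follows one well-known cost-difference formulation from that literature (Ramírez and Geffner): for each candidate goal, a planner is run once with and once without the constraint that the plan matches the observations, and the cost gap is turned into a likelihood. Whether the system described here uses this exact formulation is not stated in the abstract; the function name, the `beta` parameter, and the example goals and costs are assumptions made for illustration.

```python
import math


def goal_posterior(cost_with_obs, cost_without_obs, prior=None, beta=1.0):
    """Rank candidate goals from planner costs (illustrative sketch only).

    cost_with_obs[g]: cost of the best plan for goal g that complies with the observations.
    cost_without_obs[g]: cost of the best plan for goal g that does not comply with them.
    A goal is more likely when complying with the observations is cheap for it.
    """
    goals = list(cost_with_obs)
    if prior is None:
        # Uniform prior over goals when none is supplied.
        prior = {g: 1.0 / len(goals) for g in goals}
    likelihood = {}
    for g in goals:
        delta = cost_with_obs[g] - cost_without_obs[g]
        # Sigmoid of the cost difference: P(O | g) = 1 / (1 + exp(beta * delta)).
        likelihood[g] = 1.0 / (1.0 + math.exp(beta * delta))
    normalizer = sum(likelihood[g] * prior[g] for g in goals)
    return {g: likelihood[g] * prior[g] / normalizer for g in goals}


# Hypothetical plan costs for two narrative goals.
posterior = goal_posterior(
    cost_with_obs={"rescue": 10, "escape": 14},
    cost_without_obs={"rescue": 12, "escape": 10},
)
```

In this made-up example the observed actions lie on a cheap path to the "rescue" goal but force a detour for "escape", so "rescue" receives most of the posterior mass.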
Horswill
Players in Dear Leader's Happy Story Time are placed in the role of contestants in a reality TV show where they are forced to audition for roles in the upcoming film of the host, a deranged billionaire who has inexplicably been elected president. The stories are produced by a story generator that combines stock plots and characters to produce kitsch story outlines. The players then collaborate to improvise a camp performance of the outline. The game design provides a context for experimenting with automatic story generation within a narrative game, as well as an opportunity for experimenting with knowledge representation schemes for expressing the tropes of popular narrative. The story generator uses a higher-order logic for describing tropes, and an HTN planning algorithm based on Nau et al.'s SHOP.
Geib
This paper presents a new model of cooperative behavior based on the interaction of plan recognition and automated planning. Based on observations of the actions of an "initiator" agent, a "supporter" agent uses plan recognition to hypothesize the plans and goals of the initiator. The supporter agent then proposes and plans for a set of subgoals it will achieve to help the initiator. The approach is demonstrated in an open-source, virtual robot platform.
Azad
A live interactive narrative (LIN) is an experience where multiple players take on fictional roles and interact with real-world objects and actors to participate in a pre-authored narrative. Temporal properties of LINs are important to their viability and aesthetic quality and hence deserve special design consideration. In this paper, we tackle the largely overlooked problem of scheduling a multiplayer interactive narrative and propose the Live Interactive Narrative Scheduling Problem (LINSP), which handles reasoning under temporal uncertainty, resource scheduling, and non-linear plot choices. We present a mixed-integer linear programming formulation of the problem and empirically evaluate its scalability over large narrative instances.
Dobre
We present a suite of techniques for extending the Partially Observable Monte Carlo Planning algorithm to handle complex multi-agent games. We design the planning algorithm to exploit the inherent structure of the game. When game rules naturally cluster the actions into sets called types, these can be leveraged to extract characteristics and high-level strategies from a sparse corpus of human play. Another key insight is to account for action legality both when extracting policies from game play and when these are used to inform the forward sampling method. We evaluate our algorithm against other baselines and versus ablated versions of itself in the well-known board game Settlers of Catan.
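The point about action legality can be sketched in a few lines: a prior over actions (for example, one extracted from human play) is restricted to the actions that are legal in the current state and renormalized before it is used to sample during forward simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm; the function names, the uniform fallback, and the action labels are hypothetical.

```python
import random


def masked_policy(prior, legal_actions):
    """Restrict a prior over actions to the legal ones and renormalize."""
    legal = {a: p for a, p in prior.items() if a in legal_actions}
    total = sum(legal.values())
    if total <= 0.0:
        # Assumed fallback: uniform over legal actions when the prior gives them no mass.
        return {a: 1.0 / len(legal_actions) for a in legal_actions}
    return {a: p / total for a, p in legal.items()}


def sample_action(prior, legal_actions, rng):
    """Sample one legal action according to the masked, renormalized prior."""
    policy = masked_policy(prior, legal_actions)
    actions = list(policy)
    return rng.choices(actions, weights=[policy[a] for a in actions], k=1)[0]


# Hypothetical prior over action types and a state where one of them is illegal.
prior = {"build_road": 0.5, "build_city": 0.25, "trade": 0.25}
legal = {"build_road", "trade"}
policy = masked_policy(prior, legal)
choice = sample_action(prior, legal, random.Random(0))
```

Masking before sampling keeps rollouts from wasting simulations on moves the rules forbid, while preserving the relative preferences among the moves that remain.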
GrASP: Gradient-Based Affordance Selection for Planning
Veeriah, Vivek, Zheng, Zeyu, Lewis, Richard, Singh, Satinder
Planning with a learned model is arguably a key component of intelligence. There are several challenges in realizing such a component in large-scale reinforcement learning (RL) problems. One such challenge is dealing effectively with continuous action spaces when using tree-search planning (e.g., it is not feasible to consider every action even at just the root node of the tree). In this paper we present a method for selecting affordances useful for planning -- for learning which small number of actions/options from a continuous space of actions/options to consider in the tree-expansion process during planning. We consider affordances that are goal-and-state-conditional mappings to actions/options as well as unconditional affordances that simply select actions/options available in all states. Our selection method is gradient based: we compute gradients through the planning procedure to update the parameters of the function that represents affordances. Our empirical work shows that it is feasible to learn to select both primitive-action and option affordances, and that simultaneously learning to select affordances and planning with a learned value-equivalent model can outperform model-free RL.
ExPoSe: Combining State-Based Exploration with Gradient-Based Online Search
Mittal, Dixant, Aravindan, Siddharth, Lee, Wee Sun
A tree-based online search algorithm iteratively simulates trajectories and updates Q-value information on a set of states represented by a tree structure. Alternatively, policy gradient based online search algorithms update the information obtained from simulated trajectories directly onto the parameters of the policy and have been found to be effective. While tree-based methods limit the updates from simulations to the states that exist in the tree and do not interpolate the information to nearby states, policy gradient search methods do not do explicit exploration. In this paper, we show that it is possible to combine and leverage the strengths of these two methods for improved search performance. We examine the key reasons behind the improvement and propose a simple yet effective online search method, named Exploratory Policy Gradient Search (ExPoSe), that updates both the parameters of the policy and the search information on the states in the trajectory. We conduct experiments on complex planning problems, which include Sokoban and Hamiltonian cycle search in sparse graphs, and show that combining exploration with policy gradient improves online search performance.
Augmented Business Process Management Systems: A Research Manifesto
Dumas, Marlon, Fournier, Fabiana, Limonad, Lior, Marrella, Andrea, Montali, Marco, Rehse, Jana-Rebecca, Accorsi, Rafael, Calvanese, Diego, De Giacomo, Giuseppe, Fahland, Dirk, Gal, Avigdor, La Rosa, Marcello, Völzer, Hagen, Weber, Ingo
These opportunities require a significant shift in the way the BPMS operates and interacts with its operators (both human and digital agents). While traditional BPMSs encode pre-defined flows and rules, an ABPMS is able to reason about the current state of the process (or across several processes) to determine a course of action that improves the performance of the process. To fully exploit this capability, the ABPMS needs a degree of autonomy. Naturally, this autonomy needs to be framed by operational assumptions, goals, and environmental constraints. Also, ABPMSs need to engage conversationally with human agents, they need to explain their actions, and they need to recommend adaptations or improvements in the way the process is performed. This manifesto outlined a number of research challenges that need to be overcome to realize systems that exhibit these characteristics.
Augmented Business Process Management Systems: A Research Manifesto
In this direction, a number of techniques from the field of AI have been applied to BPMSs with the aim of increasing the degree of automated process adaptation (Marrella, 2018, 2019). In several approaches (Gajewski et al., 2005; Ferreira and Ferreira, 2006; Marrella and Lespérance, 2013, 2017), if a task failure occurs at run-time and leads to a process goal violation, a partial-order AI planner generates a new complete process definition that complies with the goal. As a side effect, this often significantly modifies the assignment of tasks to process participants. Bucchiarone et al. (2011) propose a goal-driven approach to adapt processes to run-time context changes. Process and context changes that prevent goal achievement are specified at design-time, and recovery strategies are built at run-time through an adaptation mechanism based on service composition via AI planning.