If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Non-player characters (NPCs) in video games are a common source of frustration for players because they generally provide no explanations for their actions, or provide only simplistic explanations using fixed scripts. Motivated by this, we consider a new design for agents that can learn about their environments, accomplish a range of goals, and explain what they are doing to a supervisor. We propose a framework for studying this type of agent, and compare it to existing reinforcement learning and self-motivated agent frameworks. We propose a novel design for an initial agent that acts within this framework. Finally, we describe an evaluation centered around the supervisor's satisfaction and understanding of the agent's behavior.
Learning is an important aspect of human intelligence. People learn from various aspects of their experience over time. We present an episodic infrastructure for learning in the context of a cognitive architecture, ICARUS. After a review of this architecture, we formally define the architectural extensions for episodic capabilities. We then demonstrate the extended system's capability to learn planning operators using the episodic traces from two Minecraft-like scenarios.
Molineaux, Matthew (Knexus Research Corporation) | Floyd, Michael W. (Knexus Research Corporation) | Dannenhauer, Dustin (United States Naval Research Laboratory) | Aha, David W. (United States Naval Research Laboratory)
Human-agent teaming is a difficult yet relevant problem domain to which many goal reasoning systems are well suited, due to their ability to accept outside direction and (relatively) human-understandable internal state. We propose a formal model, and multiple variations on a multi-agent problem, to clarify and unify research in goal reasoning. We describe examples of these concepts, and propose standard evaluation methods for goal reasoning agents that act as a member of a team or on behalf of a supervisor.
Common approaches to learning complex tasks in reinforcement learning include reward shaping, environmental hints, and curricula. Yet few studies examine how they compare to each other, when one might be preferred, or how they may complement each other. As a first step in this direction, we compare reward shaping, hints, and curricula for a Deep RL agent in the game of Minecraft. We seek to answer whether reward shaping, visual hints, or curricula have the most impact on performance, which we measure as the time to reach the target, the distance from the target, the cumulative reward, or the number of actions taken. Our analyses show that performance is most impacted by the curriculum used and visual hints; shaping had less impact. For similar navigation tasks, the results suggest that designing an effective curriculum and providing appropriate hints most improve performance.
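To make the term "reward shaping" concrete, here is a minimal sketch of one widely used variant, potential-based shaping, applied to a navigation task. The abstract does not say which shaping function the authors used, so the distance-based potential, the function names, and the coordinates below are all illustrative assumptions rather than the study's method.

```python
import math

def potential(position, target):
    """Assumed potential: negative Euclidean distance, so closer states score higher."""
    return -math.dist(position, target)

def shaped_reward(env_reward, position, next_position, target, gamma=0.99):
    """Add the shaping term gamma * phi(s') - phi(s) to the environment reward."""
    return (
        env_reward
        + gamma * potential(next_position, target)
        - potential(position, target)
    )

# Hypothetical 2-D positions; the target sits 5 units from the origin.
target = (3.0, 4.0)

# Stepping from distance 5 to distance 4 earns a positive bonus.
toward = shaped_reward(0.0, (0.0, 0.0), (3.0, 0.0), target, gamma=1.0)

# Stepping back from distance 4 to distance 5 earns a matching penalty.
away = shaped_reward(0.0, (3.0, 0.0), (0.0, 0.0), target, gamma=1.0)
```

With `gamma=1.0` the bonus for a step equals the reduction in distance to the target, so a round trip nets zero extra reward. This is the property that keeps potential-based shaping from changing which policies are optimal.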
Temporal logics have been used in autonomous planning to represent and reason about temporal planning problems. However, such techniques have typically been restricted to either (1) representing actions, events, and goals with temporal properties or (2) planning for temporally-extended goals under restrictive assumptions. We introduce Mixed Propositional Metric Temporal Logic (MPMTL), where formulae are built over mixed binary and continuous real variables. We introduce a planner, MTP, that solves MPMTL problems and includes a SAT solver, a model checker for a polynomial fragment of MPMTL, and a forward search algorithm. We extend PDDL 2.1 with MPMTL syntax to create MPDDL and an associated parser. The empirical study shows that MTP outperforms the state-of-the-art PDDL+ planner SMTPlan+ on several domains it performed best on, and that MTP performs and scales well with problem size on challenging domains with rich temporal properties that we create.
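As a rough illustration of a time-bounded temporal property over mixed Boolean and real-valued variables, the sketch below checks a bounded "eventually" operator, in the general style of metric temporal logic, pointwise over a finite sampled trace. It is not the MPMTL semantics or the MTP model checker, which the abstract does not specify; the trace, variable names, and helper functions are hypothetical.

```python
def eventually_within(trace, start_time, lower, upper, predicate):
    """Return True if `predicate` holds at some sample whose timestamp lies in
    [start_time + lower, start_time + upper] (pointwise check over samples)."""
    return any(
        predicate(state)
        for time, state in trace
        if start_time + lower <= time <= start_time + upper
    )

# Hypothetical timed trace: (time in seconds, state with Boolean and real variables).
trace = [
    (0.0, {"door_open": False, "temperature": 18.5}),
    (2.0, {"door_open": True, "temperature": 19.0}),
    (5.0, {"door_open": True, "temperature": 21.5}),
]

def warm_and_open(state):
    """Mixed condition: a Boolean variable and a threshold on a real variable."""
    return state["door_open"] and state["temperature"] >= 21.0

# "Within 5 seconds of time 0, the door is open and the temperature is at least 21."
holds_by_5s = eventually_within(trace, 0.0, 0.0, 5.0, warm_and_open)

# The same property with a tighter 4-second deadline.
holds_by_4s = eventually_within(trace, 0.0, 0.0, 4.0, warm_and_open)
```

Here `holds_by_5s` is true because the sample at 5.0 seconds satisfies both conditions, while `holds_by_4s` is false because no sample inside the shorter interval does.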
Learning by observation allows non-technical experts to transfer their skills to an agent by shifting the knowledge-transfer task to the agent. However, for the agent to learn regardless of expert, domain, or observed behavior, it must learn in a general-purpose manner. Existing learning by observation agents allow for domain-independent learning and reasoning but require human intervention to model the agent’s inputs and outputs. We describe Domain-Independent Deep Feature Learning by Observation (DIDFLO), an agent that uses convolutional neural networks to learn without explicitly defining input features. DIDFLO uses the raw visual inputs at two levels of granularity to automatically learn input features using limited training data. We evaluate DIDFLO in scenarios drawn from a simulated soccer domain and provide a comparison to other learning by observation agents in this domain.
Heuristics serve as a powerful tool in modern domain-independent planning (DIP) systems by providing critical guidance during the search for high-quality solutions. However, they have not been broadly used with hierarchical planning techniques, which are more expressive and tend to scale better in complex domains by exploiting additional domain-specific knowledge. Complicating matters, we show that for Hierarchical Goal Network (HGN) planning, a goal-based hierarchical planning formalism that we focus on in this paper, any poly-time heuristic that is derived from a delete-relaxation DIP heuristic has to make some relaxation of the hierarchical semantics. To address this, we present a principled framework for incorporating DIP heuristics into HGN planning using a simple relaxation of the HGN semantics we call Hierarchy-Relaxation. This framework allows for computing heuristic estimates of HGN problems using any DIP heuristic in an admissibility-preserving manner. We demonstrate the feasibility of this approach by using the LMCut heuristic to guide an optimal HGN planner. Our empirical results with three benchmark domains demonstrate that simultaneously leveraging hierarchical knowledge and heuristic guidance substantially improves planning performance.
Sci-fi narratives permeating the collective consciousness endow AI Rebellion with ample negative connotations. However, for AI agents, as for humans, attitudes of protest, objection, and rejection have many potential benefits in support of ethics, safety, self-actualization, solidarity, and social justice, and are necessary in a wide variety of contexts. We launch a conversation on constructive AI rebellion and describe a framework meant to support discussion, implementation, and deployment of AI Rebel Agents as protagonists of positive narratives.
Coman, Alexandra (National Research Council/Naval Research Laboratory) | Johnson, Benjamin (National Research Council/Naval Research Laboratory) | Briggs, Gordon (National Research Council/Naval Research Laboratory) | Aha, David W. (Naval Research Laboratory)
Human attitudes of objection, protest, and rebellion have undeniable potential to bring about social benefits, from social justice to healthy balance in relationships. At times, they can even be argued to be ethically obligatory. Conversely, AI rebellion is largely seen as a dangerous, destructive prospect. With the increase of interest in collaborative human/AI environments in which synthetic agents play social roles or, at least, exhibit behavior with social and ethical implications, we believe that AI rebellion could have benefits similar to those of its counterpart in humans. We introduce a framework meant to help categorize and design Rebel Agents, discuss their social and ethical implications, and assess their potential benefits and the risks they may pose. We also present AI rebellion scenarios in two considerably different contexts (military unmanned vehicles and computational social creativity) that exemplify components of the framework.
Menager, David (University of Kansas) | Choi, Dongkyu (University of Kansas) | Floyd, Michael W. (Knexus Research Corporation) | Task, Christine (Knexus Research Corporation) | Aha, David W. (Naval Research Laboratory)
For robots to work with humans as a team, they need to be aware of what their teammates are doing. Since it is unrealistic to expect humans to constantly communicate their goals and intentions, it is crucial for the robots to accurately and autonomously recognize their teammates’ goals. Furthermore, as these human–robot teams may perform a variety of missions in dynamically changing contexts, the teammates’ goals may change suddenly without warning. Historically, goal recognition systems have not directly addressed this issue, but have predominantly focused on situations with static goals. This paper presents a windowing strategy that enables goal recognition systems to detect goal changes in a fast and accurate manner. We describe this novel approach, and show its benefits through an empirical study spanning three different domains.
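The general idea of windowing can be sketched as follows: score each candidate goal against only the most recent observations, so that evidence for an abandoned goal ages out. The abstract does not describe the paper's actual algorithm, so the counting-based scorer, the goal model, and the action names below are assumptions chosen purely for illustration.

```python
from collections import deque

def recognize_goals(observations, goal_actions, window_size=3):
    """Return the best-matching goal after each observed action.

    goal_actions maps each goal to the set of actions consistent with it
    (a hypothetical hand-written model). Only the last `window_size`
    observations are scored.
    """
    window = deque(maxlen=window_size)
    estimates = []
    for action in observations:
        window.append(action)
        scores = {
            goal: sum(1 for seen in window if seen in actions)
            for goal, actions in goal_actions.items()
        }
        # Ties go to the goal listed first in goal_actions.
        estimates.append(max(goal_actions, key=lambda goal: scores[goal]))
    return estimates

goal_actions = {
    "fetch_water": {"walk_to_well", "fill_bucket"},
    "build_fire": {"collect_wood", "strike_flint"},
}

# The teammate switches from fetching water to building a fire after two actions.
observations = [
    "walk_to_well", "fill_bucket", "collect_wood", "strike_flint", "collect_wood",
]

windowed = recognize_goals(observations, goal_actions, window_size=3)
full_history = recognize_goals(observations, goal_actions, window_size=len(observations))
```

In this toy trace the windowed recognizer switches to `build_fire` at the fourth observation, whereas scoring the full history does not switch until the fifth, because the two early water-related actions still count against the new goal.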