Recent progress in Game AI has demonstrated that given enough data from human gameplay, or experience gained via simulations, machines can rival or surpass the most skilled human players in classic games such as Go, or commercial computer games such as StarCraft. We review the current state of the art through the lens of wargaming, and ask firstly what features of wargames distinguish them from the usual AI testbeds, and secondly which recent AI advances are best suited to address these wargame-specific features.
We present Neural A*, a novel data-driven search algorithm for path planning problems. Although data-driven planning has received much attention in recent years, little work has focused on how search-based methods can learn from demonstrations to plan better. In this work, we reformulate a canonical A* search algorithm to be differentiable and couple it with a convolutional encoder to form an end-to-end trainable neural network planner. Neural A* solves a path planning problem by (1) encoding a visual representation of the problem to estimate a movement cost map and (2) performing the A* search on the cost map to output a solution path. By minimizing the difference between the search results and ground-truth paths in demonstrations, the encoder learns to capture a variety of visual planning cues in input images, such as shapes of dead-end obstacles, bypasses, and shortcuts, which makes estimated cost maps informative. Our extensive experiments confirmed that Neural A* (a) outperformed state-of-the-art data-driven planners in terms of the search optimality and efficiency trade-off and (b) predicted realistic pedestrian paths by directly performing a search on raw image inputs.
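Step (2) of the description above, searching on a cost map, can be illustrated with a minimal sketch. This is ordinary, non-differentiable A* on a hand-written grid, not the differentiable reformulation or the convolutional encoder from the paper. The function name `astar_on_cost_map`, the 4-connected neighbourhood, and the rule that entering a cell costs that cell's value are all assumptions made for illustration.

```python
import heapq


def astar_on_cost_map(cost_map, start, goal):
    """Plain A* on a 2D grid (illustrative, not the paper's differentiable version).

    Assumption: entering cell (r, c) costs cost_map[r][c], and all costs are > 0.
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    min_cost = min(min(row) for row in cost_map)

    def heuristic(cell):
        # Manhattan distance scaled by the cheapest cell cost, so it never
        # overestimates the remaining cost on a 4-connected grid.
        return min_cost * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    open_heap = [(heuristic(start), 0.0, start)]
    best_g = {start: 0.0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g_cur, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue  # stale heap entry
        closed.add(cur)
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1], g_cur
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                new_g = g_cur + cost_map[nr][nc]
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    parent[nxt] = cur
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None, float("inf")
```

In the setting the abstract describes, the cost map would come from the learned encoder rather than being written by hand; here a high-cost cell simply plays the role of an obstacle that the search routes around.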
Limited power and computational resources, absence of high-end sensor equipment and GPS-denied environments are challenges faced by autonomous micro aerial vehicles (MAVs). We address these challenges in the context of autonomous navigation and landing of MAVs in indoor environments and propose a vision-based control approach using Supervised Learning. To achieve this, we collected data samples in a simulation environment which were labelled according to the optimal control command determined by a path planning algorithm. Based on these data samples, we trained a Convolutional Neural Network (CNN) that maps low resolution image and sensor input to high-level control commands. We have observed promising results in both obstructed and non-obstructed simulation environments, showing that our model is capable of successfully navigating a MAV towards a landing platform. Our approach requires shorter training times than similar Reinforcement Learning approaches and can potentially overcome the limitations of manual data collection faced by comparable Supervised Learning approaches.
This paper introduces and studies a graph-based variant of the path planning problem arising in hostile environments. We consider a setting where an agent (e.g. a robot) must reach a given destination while avoiding being intercepted by probabilistic entities which exist in the graph with a given probability and move according to a probabilistic motion pattern known a priori. Given a goal vertex and a deadline to reach it, the agent must compute the path to the goal that maximizes its chances of survival. We study the computational complexity of the problem, and present two algorithms for computing high-quality solutions in the general case: an exact algorithm based on Mixed-Integer Nonlinear Programming, which works well on instances of moderate size, and a pseudo-polynomial time heuristic algorithm that can solve large-scale problems in reasonable time. We also consider the two limit cases where the agent can survive with probability 0 or 1, and provide specialized algorithms to detect these kinds of situations more efficiently.
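To make the objective concrete, here is a minimal sketch of a deadline-constrained, survival-maximizing path search under a strong simplifying assumption: interception events at different time steps are treated as independent, each with a known probability `risk(v, t)`. The setting in the abstract does not guarantee this, since an entity's presence is correlated over time, so this is neither the paper's exact method nor its heuristic. The names `max_survival_path`, `adj`, and `risk` are hypothetical.

```python
def max_survival_path(adj, risk, start, goal, deadline):
    """Dynamic programme over time steps (illustrative simplification).

    adj:   dict mapping each vertex to an iterable of neighbours.
    risk:  function (vertex, time) -> interception probability in [0, 1],
           ASSUMED independent across time steps.
    Returns (survival_probability, path), or (0.0, None) if the goal
    cannot be reached within the deadline.
    """
    # best[v] = (survival probability, path) of the best way to be at v now
    best = {start: (1.0 - risk(start, 0), [start])}
    best_goal = best.get(goal)

    for t in range(1, deadline + 1):
        nxt = {}
        for u, (prob, path) in best.items():
            # The agent either moves to a neighbour or waits at u.
            for v in list(adj[u]) + [u]:
                cand = prob * (1.0 - risk(v, t))
                if v not in nxt or cand > nxt[v][0]:
                    nxt[v] = (cand, path + [v])
        best = nxt
        # Record the best arrival at the goal at any time up to the deadline.
        if goal in best and (best_goal is None or best[goal][0] > best_goal[0]):
            best_goal = best[goal]

    return best_goal if best_goal is not None else (0.0, None)
```

The running time grows with the deadline multiplied by the number of edges, which is why the deadline, rather than only the graph size, drives the cost of this kind of time-expanded search.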
We achieved a new milestone in the difficult task of enabling agents to learn about their environment autonomously. Our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone. Our target representation (the Planning Domain Definition Language) is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern heuristic search capabilities. We demonstrate how the sophisticated innate prior we place on the learning process significantly reduces the complexity of the learned representation, and reveals a connection to the graph-theoretic
E.g., its search space was shown to be compatible with symbolic Goal Recognition [Amado et al., 2018]. One major drawback of the previous work was that it used a non-descriptive, black-box neural model as the successor generator. Not only that such a black-box model is incompatible with the existing heuristic search techniques, but also, since a neural network is able to model a very complex function, its direct translation into a compact logical formula via a rule-based transfer learning method turned out futile [Asai, 2019a]: The model complexity causes an exponentially large grounded action model that cannot be processed by the modern classical planners. Thus, obtaining the descriptive action models from the raw observations with minimal human interference is the next key milestone for expanding the scope of applying Automated Planning to the raw unstructured inputs.
People often plan hierarchically. That is, rather than planning over a monolithic representation of a task, they decompose the task into simpler subtasks and then plan to accomplish those. Although much work explores how people decompose tasks, there is less analysis of why people decompose tasks in the way they do. Here, we address this question by formalizing task decomposition as a resource-rational representation problem. Specifically, we propose that people decompose tasks in a manner that facilitates efficient use of limited cognitive resources given the structure of the environment and their own planning algorithms. Using this model, we replicate several existing findings. Our account provides a normative explanation for how people identify subtasks as well as a framework for studying how people reason, plan, and act using resource-rational representations.
Combinatorial optimization problems arise in various and heterogeneous domains such as routing, scheduling, planning, decision-making processes, transportation and telecommunications, and therefore have a direct impact on practical scenarios. Existing approaches suffer from certain limitations when applied to practical problems: forbidding execution time and the need to hand engineer
combinatorial challenges. We note that the inherent structure of the problems in numerous fields or the data itself is that of a graph. In this light, it is of paramount interest to examine the potential of machine learning for addressing combinatorial optimization problems on graphs and in particular, for overcoming the limitations of the traditional approaches.
Earth observation resources are becoming increasingly indispensable in disaster relief, damage assessment and related domains. Many unpredicted factors, such as changes in observation task requirements, bad weather, and resource failures, may cause the scheduled observation scheme to become infeasible. Therefore, it is crucial to be able to promptly, and possibly frequently, develop high-quality replanned observation schemes that minimize the effects on the scheduled tasks. A bottom-up distributed coordinated framework, together with an improved contract net, is proposed to facilitate dynamic task replanning for heterogeneous Earth observation resources. This hierarchical framework consists of three levels, namely, neighboring resource coordination, single planning center coordination, and multiple planning center coordination. Observation tasks affected by unpredicted factors are assigned and treated along a bottom-up route from resources to planning centers. This bottom-up distributed coordinated framework transfers part of the computing load to various nodes of the observation systems to allocate tasks more efficiently and robustly. To support the prompt assignment of large-scale tasks to proper Earth observation resources in dynamic environments, we propose a multiround combinatorial allocation (MCA) method. Moreover, a new float interval-based local search algorithm is proposed to obtain a promising planning scheme more quickly. The experiments demonstrate that the MCA method can achieve a better task completion rate for large-scale tasks with satisfactory time efficiency. They also demonstrate that this method can help to efficiently obtain replanning schemes based on the original scheme in dynamic environments.
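For readers unfamiliar with the contract net idea that the framework builds on, the following is a minimal sketch of a basic announce, bid, award loop. It is the textbook protocol in its simplest form, not the improved contract net or the MCA method described above, whose details the abstract does not give. The names `contract_net_allocate`, `resources`, and `bid` are hypothetical, as is the one-unit-of-capacity-per-task rule.

```python
def contract_net_allocate(tasks, resources, bid):
    """Basic contract-net style allocation (illustrative only).

    tasks:     list of task identifiers, announced one at a time.
    resources: dict mapping resource id -> remaining capacity (task count).
    bid:       function (resource, task) -> cost, or None if the resource
               cannot perform the task.
    Returns a dict mapping each awarded task to the winning resource.
    """
    capacity = dict(resources)  # do not mutate the caller's dict
    assignment = {}
    for task in tasks:
        # Announcement: every resource with spare capacity may bid.
        bids = []
        for res, cap in capacity.items():
            if cap > 0:
                cost = bid(res, task)
                if cost is not None:
                    bids.append((cost, res))
        # Award: the lowest-cost bidder wins; tasks with no bids stay unassigned.
        if bids:
            _, winner = min(bids)
            assignment[task] = winner
            capacity[winner] -= 1
    return assignment
```

Tasks left out of the returned dict are the ones that, in a hierarchical scheme like the one described, would be passed up to the next coordination level.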
Autonomous and semi-autonomous systems for safety-critical applications require rigorous testing before deployment. Due to the complexity of these systems, formal verification may be impossible and real-world testing may be dangerous during development. Therefore, simulation-based techniques have been developed that treat the system under test as a black box during testing. Safety validation tasks include finding disturbances to the system that cause it to fail (falsification), finding the most-likely failure, and estimating the probability that the system fails. Motivated by the prevalence of safety-critical artificial intelligence, this work provides a survey of state-of-the-art safety validation techniques with a focus on applied algorithms and their modifications for the safety validation problem. We present and discuss algorithms in the domains of optimization, path planning, reinforcement learning, and importance sampling. Problem decomposition techniques are presented to help scale algorithms to large state spaces, and a brief overview of safety-critical applications is given, including autonomous vehicles and aircraft collision avoidance systems. Finally, we present a survey of existing academic and commercially available safety validation tools.
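As a small illustration of the failure-probability estimation task mentioned above, the sketch below applies importance sampling to a toy black-box system. The system is an assumption made purely for illustration: a scalar disturbance drawn from a standard normal distribution, with failure declared when it exceeds a threshold. The function name `estimate_failure_probability` and the choice of a normal proposal centred on the threshold are likewise hypothetical.

```python
import math
import random


def estimate_failure_probability(threshold=4.0, n_samples=20000, seed=0):
    """Importance-sampling estimate of P(x > threshold) for x ~ N(0, 1).

    Toy stand-in for a system under test: the "failure" is x > threshold.
    Samples are drawn from the proposal N(threshold, 1), which puts about
    half of its mass in the failure region, and are reweighted by the
    likelihood ratio of the nominal density to the proposal density.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal
        if x > threshold:  # failure indicator
            # N(0,1) pdf / N(threshold,1) pdf = exp(-threshold*x + threshold^2/2)
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples
```

With the default threshold of 4, the true probability is roughly 3.2e-5, so naive Monte Carlo with the same sample budget would usually observe no failures or only one, whereas the reweighted estimate draws on thousands of failure samples.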
Analyzing the surrounding situation is the most crucial part of autonomous adaptation. Because the real environment contains many unknown and constantly changing factors, autonomy requires moment-to-moment adjustment to changing circumstances. To respond properly to a changing environment, a fully autonomous vehicle must be able to comprehend its own position and its surrounding environment. However, these vehicles rely heavily on human involvement to resolve complex missions that cannot be precisely characterized in advance, which restricts their applications and accuracy. Dependence on human supervision can be reduced by raising the level of autonomy. Over the past decades, autonomy and mission planning have been researched extensively on different structures and under diverse conditions; nevertheless, aiming at robust mission planning in extreme conditions, we provide here an exhaustive study of UV autonomy and its related properties in internal and external situation awareness. The discussion that follows covers the different difficulties in the scope of AUVs and UAVs.