Homotopy Based Reinforcement Learning with Maximum Entropy for Autonomous Air Combat
Zhu, Yiwen, Fang, Zhou, Zheng, Yuan, Wei, Wenya
Intelligent decision-making for unmanned combat aerial vehicles (UCAVs) has long been a challenging problem. Conventional search methods can hardly satisfy the real-time demands of highly dynamic air combat scenarios. Reinforcement learning (RL) can significantly shorten decision time by using neural networks. However, the sparse reward problem limits its convergence speed, and a reward built from artificial prior experience can easily steer training away from the optimum of the original task, which makes RL difficult to apply to air combat. In this paper, we propose a homotopy-based soft actor-critic method (HSAC) that addresses these problems by following the homotopy path between the original task with sparse reward and the auxiliary task with artificial prior experience reward. The convergence and feasibility of this method are also proved in this paper. To confirm the feasibility of our method, we first construct a detailed 3D air combat simulation environment for training RL-based methods, and we then implement our method in both the attack horizontal flight UCAV task and the self-play confrontation task. Experimental results show that our method performs better than methods that use only the sparse reward or only the artificial prior experience reward. The agent trained by our method reaches a win rate of more than 98.3% in the attack horizontal flight UCAV task and an average win rate of 67.4% when confronted with the agents trained by the other two methods.
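The idea of following a homotopy path between two reward definitions can be illustrated with a small sketch. The linear blend, the parameter name lam, and the linear schedule below are assumptions made for illustration; the abstract does not state the exact form HSAC uses.

```python
def homotopy_reward(sparse_reward: float, shaped_reward: float, lam: float) -> float:
    """Blend the auxiliary (shaped) reward with the original sparse reward.

    lam = 0.0 recovers the auxiliary task, lam = 1.0 recovers the original task.
    The linear form is an illustrative assumption, not the paper's definition.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * shaped_reward + lam * sparse_reward


def linear_schedule(step: int, total_steps: int) -> float:
    """Hypothetical schedule that moves lam from 0 to 1 over training."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))
```

Early in training the agent would mostly see the dense auxiliary signal; as lam approaches 1 the objective coincides with the original sparse-reward task.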
Planning with Incomplete Information in Quantified Answer Set Programming
Fandinno, Jorge, Laferrière, François, Romero, Javier, Schaub, Torsten, Son, Tran Cao
We present a general approach to planning with incomplete information in Answer Set Programming (ASP). More precisely, we consider the problems of conformant and conditional planning with sensing actions and assumptions. We represent planning problems using a simple formalism where logic programs describe the transition function between states, the initial states and the goal states. For solving planning problems, we use Quantified Answer Set Programming (QASP), an extension of ASP with existential and universal quantifiers over atoms that is analogous to Quantified Boolean Formulas (QBFs). We define the language of quantified logic programs and use it to represent the solutions to different variants of conformant and conditional planning. On the practical side, we present a translation-based QASP solver that converts quantified logic programs into QBFs and then executes a QBF solver, and we evaluate experimentally the approach on conformant and conditional planning benchmarks.
Towards A Logical Account of Epistemic Causality
Khan, Shakil M., Soutchanski, Mikhail
Reasoning about observed effects and their causes is important in multi-agent contexts. While there has been much work on causality from an objective standpoint, causality from the point of view of some particular agent has received much less attention. In this paper, we address this issue by incorporating an epistemic dimension into an existing formal model of causality. We define what it means for an agent to know the causes of an effect. Then, using a counterexample, we prove that epistemic causality is a different notion from its objective counterpart.
1 Introduction
Research on actual causality involves finding in a given narrative (trace) the event that caused an effect. Pearl [25, 26] pioneered the computational enquiry into actual causality. The research was later continued by Halpern and Pearl [12, 15] and others [8, 17, 18, 13, 14]. Unfortunately, as argued by Glymour et al. [9], most of these accounts are developed by analyzing a handful of simple examples, and then validated relative to our intuition for these examples, a process which Gößler et al. [11] referred to as TEGAR (i.e. As such, even after multiple revisions, these definitions continue to suffer from various conceptual problems such as the early preemption problem and the over-determination problem. For instance, despite claims to the contrary, the definitions given in [14] suffer from the problem of preemption, which occurs when two competing events try to achieve the same effect and the latter of these fails to do so as the earlier one has already achieved the effect (see [31] and [4] for a discussion). In an attempt to address these issues, Batusov and Soutchanski [2, 3] recently proposed a new definition of actual causality that is based on a well-developed and expressive formalization of actions and change, namely the situation calculus [23, 27]. The definition is derived from first principles and does not follow a TEGAR scheme.
Reasoning about Discrete and Continuous Noisy Sensors and Effectors in Dynamical Systems
Belle, Vaishak, Levesque, Hector J.
Among the many approaches for reasoning about degrees of belief in the presence of noisy sensing and acting, the logical account proposed by Bacchus, Halpern, and Levesque is perhaps the most expressive. While their formalism is quite general, it is restricted to fluents whose values are drawn from discrete finite domains, as opposed to the continuous domains seen in many robotic applications. In this work, we show how this limitation in that approach can be lifted. By dealing seamlessly with both discrete distributions and continuous densities within a rich theory of action, we provide a very general logical specification of how belief should change after acting and sensing in complex noisy domains.
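As a rough illustration of how belief changes after a noisy sensor reading, the sketch below applies an ordinary Bayesian update to a finite set of weighted hypotheses, using a continuous Gaussian density as the sensor error model. This is a textbook update chosen for illustration, not the logical specification developed in the paper; the function names and the Gaussian noise model are assumptions.

```python
import math


def gaussian_density(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution, used here as an assumed sensor error model."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief: dict, reading: float, std: float) -> dict:
    """Reweight each hypothesized value by the likelihood of the reading, then normalize.

    belief maps a hypothesized true value to its prior weight.
    """
    weighted = {v: w * gaussian_density(reading, v, std) for v, w in belief.items()}
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("reading has zero likelihood under every hypothesis")
    return {v: w / total for v, w in weighted.items()}


# Two equally likely distances to a wall; the sensor reports 4.6.
prior = {2.0: 0.5, 5.0: 0.5}
posterior = update_belief(prior, reading=4.6, std=1.0)
```

After the update, the hypothesis closer to the reading carries most of the weight, while the other keeps a small nonzero degree of belief.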
Infinite Paths in the Situation Calculus: Axiomatization and Properties
Khan, Shakil M. (York University) | Lespérance, Yves (York University)
The situation calculus has proved to be a very popular formalism for modeling and reasoning about dynamic systems. This otherwise elegant and refined language however lacks a natural way of dealing with "infinite future histories". To this end, in this paper we introduce a new sort ranging over infinite paths in the situation calculus and propose an axiomatization for infinite paths. We thus obtain a convenient way of specifying several kinds of notions that involve infinite futures such as temporal properties of non-terminating executions of agents or programs and mental attitudes such as desires and intentions. We prove the correctness of the axiomatization and show that our formalization has some intuitively desirable properties.
Compiling Uncertainty Away in Conformant Planning Problems with Bounded Width
Palacios, Hector, Geffner, Hector
Conformant planning is the problem of finding a sequence of actions for achieving a goal in the presence of uncertainty in the initial state or action effects. The problem has been approached as a path-finding problem in belief space where good belief representations and heuristics are critical for scaling up. In this work, a different formulation is introduced for conformant problems with deterministic actions where they are automatically converted into classical ones and solved by an off-the-shelf classical planner. The translation maps literals L and sets of assumptions t about the initial situation, into new literals KL/t that represent that L must be true if t is initially true. We lay out a general translation scheme that is sound and establish the conditions under which the translation is also complete. We show that the complexity of the complete translation is exponential in a parameter of the problem called the conformant width, which for most benchmarks is bounded. The planner based on this translation exhibits good performance in comparison with existing planners, and is the basis for T0, the best performing planner in the Conformant Track of the 2006 International Planning Competition.
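The definition of a conformant plan in the first sentence can be made concrete with a brute-force check: a plan is conformant if it reaches the goal from every possible initial state. The sketch below assumes deterministic actions and a toy domain invented for illustration; it shows the belief-space view of the problem, not the KL/t translation described in the abstract.

```python
from typing import Callable, FrozenSet, Iterable, Sequence

State = FrozenSet[str]


def is_conformant(
    plan: Sequence[str],
    initial_states: Iterable[State],
    apply_action: Callable[[State, str], State],
    goal: State,
) -> bool:
    """Return True if the plan achieves the goal from every possible initial state."""
    for state in initial_states:
        for action in plan:
            state = apply_action(state, action)
        if not goal <= state:  # goal literals must all hold in the final state
            return False
    return True


def apply_light_action(state: State, action: str) -> State:
    """Hypothetical toy domain: a lamp whose initial status is unknown."""
    if action == "switch_on":
        return state | {"on"}
    if action == "toggle":
        return state ^ {"on"}
    raise ValueError(f"unknown action: {action}")


possible_starts = [frozenset(), frozenset({"on"})]
goal = frozenset({"on"})
```

Here `is_conformant(["switch_on"], possible_starts, apply_light_action, goal)` holds, while the plan `["toggle"]` fails because it turns the lamp off when it starts on. Enumerating initial states is exponential in general, which is the scaling problem the translation approach aims to avoid.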
Reasoning for Moving Blocks Problem: Formal Representation and Implementation
A combined approach of Qualitative Reasoning and Probabilistic Functions for knowledge representation is proposed. The method aims to represent the uncertain, qualitative knowledge that is essential for executing the moving blocks task. Commonsense knowledge is formalized in the Situation Calculus language, which is used for reasoning and for representing the robot's beliefs. The method is implemented in the Prolog programming language and tested on a specific simulated scenario. In most cases the implementation enables us to solve a given task, i.e., move blocks to desired positions. An example of the robot's reasoning and the main parts of the implemented program's code are presented.
Probabilistic Reasoning about Actions in Nonmonotonic Causal Theories
Eiter, Thomas, Lukasiewicz, Thomas
We present the language $\mathrm{P}\mathcal{C}+$ for probabilistic reasoning about actions, which is a generalization of the action language $\mathcal{C}+$ that allows dealing with probabilistic as well as nondeterministic effects of actions. We define a formal semantics of $\mathrm{P}\mathcal{C}+$ in terms of probabilistic transitions between sets of states. Using a concept of a history and its belief state, we then show how several important problems in reasoning about actions can be concisely formulated in our formalism.
Bounded Situation Calculus Action Theories and Decidable Verification
De Giacomo, Giuseppe (Sapienza Università di Roma) | Lespérance, Yves (York University) | Patrizi, Fabio (Sapienza Università di Roma)
We define a notion of bounded action theory in the situation calculus, where the theory entails that in all situations, the number of ground fluent atoms is bounded by a constant. Such theories can still have an infinite domain and an infinite set of states. We argue that such theories are fairly common in applications, either because facts do not persist indefinitely or because one eventually forgets some facts as one learns new ones. We discuss various ways of obtaining bounded action theories. The main result of the paper is that verification of an expressive class of first-order $\mu$-calculus temporal properties in such theories is in fact decidable. This paper is an abridged version of a paper that appeared at KR'12.
Belief change with noisy sensing in the situation calculus
Ma, Jianbing, Liu, Weiru, Miller, Paul
The situation calculus has been applied widely in artificial intelligence to model and reason about actions and change in dynamic systems. Since actions carried out by agents constantly change the agents' beliefs, managing these changes is an important issue. Shapiro et al. [22] is one of the studies that considered this issue. However, that framework does not consider the problem of noisy sensing, which often arises in real-world applications. As a consequence, noisy sensing actions in that framework will lead an agent to face an inconsistent situation, after which the agent cannot proceed further. In this paper, we investigate how noisy sensing actions can be handled in iterated belief change within the situation calculus formalism. We extend the framework proposed in [22] with the capability of managing noisy sensing. We demonstrate that an agent can still detect the actual situation when the ratio of noisy sensing actions to accurate sensing actions is limited. We prove that our framework subsumes the iterated belief change strategy in [22] when all sensing actions are accurate. Furthermore, we prove that our framework can adequately handle belief introspection, mistaken beliefs, belief revision and belief update even with noisy sensing, as done in [22] with accurate sensing actions only.