Coordinated Strategies in Realistic Air Combat by Hierarchical Multi-Agent Reinforcement Learning

Selmonaj, Ardian, Del Rio, Giacomo, Schneider, Adrian, Antonucci, Alessandro

arXiv.org Artificial Intelligence 

We focus explicitly on multi-agent RL methods in 3D air combat environments, while the survey [4] also includes single-agent RL and 2D dynamics. Several existing works employ techniques that are relevant to multi-agent air combat, such as tactical reward shaping [5], heterogeneous agents [6], attention-based neural networks for situational awareness [7], or communication mechanisms [8] to improve mission strategies. Curriculum Learning (CL) with gradually increasing task difficulty is applied in [9], while enhanced coordination among agents is achieved by adapted training algorithms [10]. The application of HMARL in defense contexts is comparatively limited. An HMARL approach that employs attention mechanisms and self-play is introduced in [11]. Frameworks more closely related to ours appear in [12], [13], with the former integrating CL and the latter employing heterogeneous leader-follower agents together with JSBSim.

In this work, we introduce a complex 3D air combat environment and a training framework to learn hierarchical policies using reward shaping and cascaded league-play that gradually increases mission complexity under realistic and heterogeneous conditions. In contrast to prior efforts that are built on established RL algorithms such as Proximal Policy Optimization (PPO) [14], we additionally adapt the recently presented SPO algorithm [3] to the hierarchical multi-agent domain. To the best of our knowledge, this adapted setup has not yet been studied in this context and represents a significant step toward enhancing the realism of such applications.
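To make the hierarchical decomposition concrete, the following is a minimal illustrative sketch of a two-level policy of the kind described above: a high-level "commander" selects a tactic at a coarse timescale, and the selected low-level policy produces the fine-grained control action. All names, the tactic set, and the stub decision rules are assumptions for illustration only, not the paper's actual interface or learned policies.

```python
import random

# Illustrative tactic set; the actual sub-policies in the paper are learned
# and their number/semantics are an assumption here.
TACTICS = ["engage", "evade", "regroup"]

def commander(observation):
    """High-level policy (stub): map an observation to a tactic index.

    A trained network would replace this placeholder decision rule.
    """
    return observation["threat_level"] % len(TACTICS)

def low_level_policy(tactic, observation):
    """Low-level policy (stub): map tactic + observation to a control action.

    Emits a placeholder continuous action (roll, pitch, throttle) in [-1, 1];
    in practice each tactic would correspond to a separately trained policy.
    """
    rng = random.Random(observation["threat_level"])  # deterministic stub
    return tactic, [round(rng.uniform(-1.0, 1.0), 3) for _ in range(3)]

def hierarchical_step(observation):
    """One hierarchical decision: commander picks a tactic, sub-policy acts."""
    tactic = TACTICS[commander(observation)]
    return low_level_policy(tactic, observation)

if __name__ == "__main__":
    tactic, action = hierarchical_step({"threat_level": 4})
    print(tactic, action)
```

In a league-play curriculum as described above, such a hierarchical agent would then be trained against opponent pools of gradually increasing strength; that training loop is omitted here.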
