Coordinated Strategies in Realistic Air Combat by Hierarchical Multi-Agent Reinforcement Learning
Selmonaj, Ardian, Del Rio, Giacomo, Schneider, Adrian, Antonucci, Alessandro
–arXiv.org Artificial Intelligence
We focus explicitly on multi-agent RL methods in 3D air combat environments, whereas the survey [4] also covers single-agent RL and 2D dynamics. Several existing works employ techniques relevant to multi-agent air combat, such as tactical reward shaping [5], heterogeneous agents [6], attention-based neural networks for situational awareness [7], or communication mechanisms [8] to improve mission strategies. Curriculum Learning (CL) with gradually increasing task difficulty is applied in [9], while enhanced coordination among agents is achieved through adapted training algorithms [10]. The application of Hierarchical Multi-Agent RL (HMARL) in defense contexts is comparatively limited. An HMARL approach that employs attention mechanisms and self-play is introduced in [11]. Frameworks more closely related to ours appear in [12], [13], with the former integrating CL and the latter employing heterogeneous leader-follower agents together with JSBSim. In this work, we introduce a complex 3D air combat environment and a training framework that learns hierarchical policies using reward shaping and cascaded league-play, gradually increasing mission complexity under realistic and heterogeneous conditions. In contrast to prior efforts that build on established RL algorithms such as Proximal Policy Optimization (PPO) [14], we additionally adapt the recently presented SPO algorithm [3] to the hierarchical multi-agent domain. To the best of our knowledge, this adapted setup has not yet been studied in this context and represents a significant step toward enhancing the realism of such applications.
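The hierarchical-policy idea described above can be sketched minimally: a high-level "commander" policy selects a tactical option for an agent at a fixed macro-step interval, and a low-level policy for that option produces the control action at every step. This is an illustrative sketch only; the option names, the decision interval, and both placeholder policies are assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a two-level hierarchical policy, assuming a
# commander that re-selects a tactical option every K environment steps.
# All names ("engage", "evade", "follow") and heuristics are hypothetical.

OPTIONS = ["engage", "evade", "follow"]
K = 5  # macro-step length: commander decision interval (assumed)


def commander_policy(obs):
    """High-level policy: map an observation to a tactical option.

    Placeholder heuristic standing in for a learned policy.
    """
    return OPTIONS[int(sum(obs) * 100) % len(OPTIONS)]


def low_level_policy(option, obs):
    """Low-level policy: map (option, observation) to control values.

    Placeholder: a fixed gain per option instead of a learned network.
    """
    gain = {"engage": 1.0, "evade": -1.0, "follow": 0.5}[option]
    return [gain * x for x in obs]


def rollout(obs_sequence):
    """Hierarchical rollout: the option is held fixed within each macro-step."""
    actions, option = [], None
    for t, obs in enumerate(obs_sequence):
        if t % K == 0:  # commander acts only at macro-step boundaries
            option = commander_policy(obs)
        actions.append((option, low_level_policy(option, obs)))
    return actions
```

In a trained system both policies would be neural networks optimized jointly or in stages (e.g., with PPO or the adapted SPO), but the temporal structure — option held fixed between commander decisions — is the core of the hierarchy.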
Oct-23-2025