Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration

Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

arXiv.org Artificial Intelligence 

Abstract: With the rapid evolution of wireless mobile devices, there is a stronger incentive to design proper collaboration mechanisms among intelligent agents. Following their individual observations, multiple intelligent agents can cooperate and gradually approach the final collective objective by continuously learning from the environment. In that regard, independent reinforcement learning (IRL) is often deployed in multi-agent collaboration to alleviate the dilemma of a non-stationary learning environment. However, the behavioral strategies of the intelligent agents in IRL can only be formulated from their local individual observations of the global environment, so appropriate communication mechanisms must be introduced to reduce their behavioral localities. In this paper, we tackle the communication problem among the intelligent agents in IRL by jointly adopting two mechanisms at different scales. At the large scale, we introduce the stigmergy mechanism as an indirect communication bridge among the independent learning agents and carefully design a mathematical representation to indicate the impact of digital pheromone. At the small scale, we propose a conflict-avoidance mechanism between adjacent agents by implementing an additionally embedded neural network to provide more opportunities for participants with higher action priorities. In addition, we present a federal training method to effectively optimize the neural networks within each agent in a decentralized manner. Finally, we establish a simulation scenario in which a number of mobile agents in a certain area move automatically to form a specified target shape, and demonstrate the superiority of our proposed methods through extensive simulations.
INTRODUCTION

With the rapid development of mobile wireless communication and Internet of Things (IoT) technologies, many scenarios are emerging that require collaboration among the intelligent agents involved, such as the deployment of unmanned aerial vehicles (UAVs) [1]-[3], distributed control in industrial automation [4]-[6], and mobile crowd sensing and computing (MCSC) [7], [8]. In these scenarios, traditional centralized control methods are usually impractical because of limited computing resources and the demand for ultra-low latency and ultra-high reliability. As an alternative, multi-agent collaboration can be introduced to reduce the pressure on the central controller. Helping autonomous agents act optimally through "trial-and-error" interaction with the environment is one of the primary goals of artificial intelligence (AI) and is regarded as an important target of reinforcement learning (RL) [9]-[11].
