DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning
Yuan, Tingting; Chung, Hwei-Ming; Yuan, Jie; Fu, Xiaoming
arXiv.org Artificial Intelligence
... improves its policy iteratively by learning from observations to achieve a given goal. RL, with a single agent to decide the behavior of all entities, faces various challenges, such as scalability (Yan et al. 2021) and privacy issues (Yuan, Chung, and Fu 2022). To this end, the extension from single-agent RL to multi-agent RL (MARL) (Hernandez-Leal, Kartal, and Taylor 2019) is favorable. MARL (Hernandez-Leal, Kartal, and Taylor 2019) has been widely used in various tasks, such as real-time resource allocation (Yuan et al. ...

Secondly, the communication delay can interfere with the cooperation between agents by introducing delays in action-making (Chen et al. 2021) and uncertainty on the arrival time of information. Previous work (Kim et al. 2019) prevents endless waiting by setting a predefined and constant bound for the waiting time, but it may restrain potential cooperation if it is set too short and conversely may cause meaningless waiting. Therefore, such a constant timer is inflexible and cannot be adapted to the dynamics in the communication networks.
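The trade-off of a constant waiting bound can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `wait_with_fixed_bound`, the example delay values, and the rule that an agent acts as soon as all messages arrive or the bound expires. None of it is taken from the paper or from Kim et al. (2019).

```python
from typing import List, Tuple


def wait_with_fixed_bound(delays: List[float], bound: float) -> Tuple[int, float]:
    """Return (messages received, time spent waiting) under a fixed bound.

    Hypothetical model: the agent acts as soon as every message has
    arrived or the bound expires, whichever comes first.
    """
    if not delays:
        return 0, 0.0
    # Only messages whose delay fits within the bound are usable.
    received = sum(1 for d in delays if d <= bound)
    # If everything arrived early, stop at the last arrival;
    # otherwise the agent waits out the whole bound.
    waited = max(delays) if received == len(delays) else bound
    return received, waited


# Hypothetical per-message delays (seconds) under two network conditions.
fast_network = [0.1, 0.2, 0.3]
slow_network = [0.4, 0.9, 5.0]

# A short bound drops a message that would have arrived soon after.
short_result = wait_with_fixed_bound(slow_network, bound=0.5)
# A long bound recovers it, but then idles for a message that never makes it.
long_result = wait_with_fixed_bound(slow_network, bound=3.0)
# On a fast network the same short bound is never the limiting factor.
fast_result = wait_with_fixed_bound(fast_network, bound=0.5)
```

In this toy setting neither bound suits the slow network: the short one loses the message arriving at 0.9 s, while the long one keeps the agent waiting until 3.0 s although nothing useful arrives after 0.9 s. This is the inflexibility that motivates choosing the waiting time adaptively.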
Dec-3-2022
- Country:
  - North America
    - United States
      - New York (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Colorado > Denver County
        - Denver (0.04)
      - California > Los Angeles County
        - Long Beach (0.04)
      - Puerto Rico > San Juan
        - San Juan (0.04)
    - Canada > Quebec
      - Montreal (0.04)
  - Europe
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Norway > Eastern Norway
      - Oslo (0.04)
    - Germany > Lower Saxony
      - Gottingen (0.14)
  - Asia > China
- Genre:
  - Research Report (0.82)
- Industry:
  - Information Technology > Security & Privacy (0.86)
- Technology: