global agent
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > Singapore (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > Singapore (0.04)
Adaptive AI-based Decentralized Resource Management in the Cloud-Edge Continuum
Li, Lanpei, Bell, Jack, Coppola, Massimo, Lomonaco, Vincenzo
The increasing complexity of application requirements and the dynamic nature of the Cloud-Edge Continuum present significant challenges for efficient resource management. These challenges stem from the ever-changing infrastructure, which is characterized by additions, removals, and reconfigurations of nodes and links, as well as the variability of application workloads. Traditional centralized approaches struggle to adapt to these changes due to their static nature, while decentralized solutions face challenges such as limited global visibility and coordination overhead. This paper proposes a hybrid decentralized framework for dynamic application placement and resource management. The framework utilizes Graph Neural Networks (GNNs) to embed resource and application states, enabling comprehensive representation and efficient decision-making. It employs a collaborative multi-agent reinforcement learning (MARL) approach, where local agents optimize resource management in their neighborhoods and a global orchestrator ensures system-wide coordination. By combining decentralized application placement with centralized oversight, our framework addresses the scalability, adaptability, and accuracy challenges inherent in the Cloud-Edge Continuum. This work contributes to the development of decentralized application placement strategies, the integration of GNN embeddings, and collaborative MARL systems, providing a foundation for efficient, adaptive and scalable resource management.
- Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning
Anand, Emile, Karmarkar, Ishani, Qu, Guannan
Reinforcement Learning (RL) has become a popular learning framework to solve sequential decision making problems in unknown environments, and has achieved tremendous success in a wide array of domains such as playing the game of Go (Silver et al., 2016), robotic control (Kober et al., 2013), and autonomous driving (Kiran et al., 2022; Lin et al., 2023). A critical feature of most real-world systems is their uncertain nature, and consequently RL has emerged as a powerful tool for learning optimal policies for multi-agent systems to operate in unknown environments (Kim & Giannakis, 2017; Zhang et al., 2021; Lin et al., 2024; Anand & Qu, 2024). While the early literature on RL predominantly focused on the single-agent setting, multi-agent reinforcement learning (MARL) has also recently achieved impressive successes in a broad range of areas, such as coordination of robotic swarms (Preiss et al., 2017), self-driving vehicles (DeWeese & Qu, 2024), real-time bidding (Jin et al., 2018), ride-sharing (Li et al., 2019), and stochastic games (Jin et al., 2020). Despite growing interest in multi-agent RL (MARL), extending RL to multi-agent settings poses significant computational challenges due to the curse of dimensionality (Sayin et al., 2021). Even if the individual agents' state or action spaces are small, the global state space or action space can take values from a set with size that is exponentially large as a function of the number of agents.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
Multi-Agent Deep Reinforcement Learning for Energy Efficient Multi-Hop STAR-RIS-Assisted Transmissions
Liao, Pei-Hsiang, Shen, Li-Hsiang, Wu, Po-Chen, Feng, Kai-Ten
Simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) provides a promising way to expand coverage in wireless communications. However, limitation of single STAR-RIS inspire us to integrate the concept of multi-hop transmissions, as focused on RIS in existing research. Therefore, we propose the novel architecture of multi-hop STAR-RISs to achieve a wider range of full-plane service coverage. In this paper, we intend to solve active beamforming of the base station and passive beamforming of STAR-RISs, aiming for maximizing the energy efficiency constrained by hardware limitation of STAR-RISs. Furthermore, we investigate the impact of the on-off state of STAR-RIS elements on energy efficiency. To tackle the complex problem, a Multi-Agent Global and locAl deep Reinforcement learning (MAGAR) algorithm is designed. The global agent elevates the collaboration among local agents, which focus on individual learning. In numerical results, we observe the significant improvement of MAGAR compared to the other benchmarks, including Q-learning, multi-agent deep Q network (DQN) with golbal reward, and multi-agent DQN with local rewards. Moreover, the proposed architecture of multi-hop STAR-RISs achieves the highest energy efficiency compared to mode switching based STAR-RISs, conventional RISs and deployment without RISs or STAR-RISs.
Cooperative Multi-Objective Reinforcement Learning for Traffic Signal Control and Carbon Emission Reduction
Tang, Cheng Ruei, Hsieh, Jun Wei, Teng, Shin You
Existing traffic signal control systems rely on oversimplified rule-based methods, and even RL-based methods are often suboptimal and unstable. To address this, we propose a cooperative multi-objective architecture called Multi-Objective Multi-Agent Deep Deterministic Policy Gradient (MOMA-DDPG), which estimates multiple reward terms for traffic signal control optimization using age-decaying weights. Our approach involves two types of agents: one focuses on optimizing local traffic at each intersection, while the other aims to optimize global traffic throughput. We evaluate our method using real-world traffic data collected from an Asian country's traffic cameras. Despite the inclusion of a global agent, our solution remains decentralized as this agent is no longer necessary during the inference stage. Our results demonstrate the effectiveness of MOMA-DDPG, outperforming state-of-the-art methods across all performance metrics. Additionally, our proposed system minimizes both waiting time and carbon emissions. Notably, this paper is the first to link carbon emissions and global agents in traffic signal control.
- Oceania > Australia (0.04)
- Europe (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- Energy (1.00)
MANSA: Learning Fast and Slow in Multi-Agent Systems
Mguni, David, Chen, Haojun, Jafferjee, Taher, Wang, Jianhong, Fei, Long, Feng, Xidong, McAleer, Stephen, Tong, Feifei, Wang, Jun, Yang, Yaodong
In multi-agent reinforcement learning (MARL), independent learning (IL) often shows remarkable performance and easily scales with the number of agents. Yet, using IL can be inefficient and runs the risk of failing to successfully train, particularly in scenarios that require agents to coordinate their actions. Using centralised learning (CL) enables MARL agents to quickly learn how to coordinate their behaviour but employing CL everywhere is often prohibitively expensive in real-world applications. Besides, using CL in value-based methods often needs strong representational constraints (e.g. individual-global-max condition) that can lead to poor performance if violated. In this paper, we introduce a novel plug & play IL framework named Multi-Agent Network Selection Algorithm (MANSA) which selectively employs CL only at states that require coordination. At its core, MANSA has an additional agent that uses switching controls to quickly learn the best states to activate CL during training, using CL only where necessary and vastly reducing the computational burden of CL. Our theory proves MANSA preserves cooperative MARL convergence properties, boosts IL performance and can optimally make use of a fixed budget on the number CL calls. We show empirically in Level-based Foraging (LBF) and StarCraft Multi-agent Challenge (SMAC) that MANSA achieves fast, superior and more reliable performance while making 40% fewer CL calls in SMAC and using CL at only 1% CL calls in LBF.
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China (0.04)
- Energy (0.67)
- Leisure & Entertainment > Games > Computer Games (0.34)
Cooperative Reinforcement Learning on Traffic Signal Control
Chao, Chi-Chun, Hsieh, Jun-Wei, Wang, Bor-Shiun
Traffic signal control is a challenging real-world problem aiming to minimize overall travel time by coordinating vehicle movements at road intersections. Existing traffic signal control systems in use still rely heavily on oversimplified information and rule-based methods. Specifically, the periodicity of green/red light alternations can be considered as a prior for better planning of each agent in policy optimization. To better learn such adaptive and predictive priors, traditional RL-based methods can only return a fixed length from predefined action pool with only local agents. If there is no cooperation between these agents, some agents often make conflicts to other agents and thus decrease the whole throughput. This paper proposes a cooperative, multi-objective architecture with age-decaying weights to better estimate multiple reward terms for traffic signal control optimization, which termed COoperative Multi-Objective Multi-Agent Deep Deterministic Policy Gradient (COMMA-DDPG). Two types of agents running to maximize rewards of different goals - one for local traffic optimization at each intersection and the other for global traffic waiting time optimization. The global agent is used to guide the local agents as a means for aiding faster learning but not used in the inference phase. We also provide an analysis of solution existence together with convergence proof for the proposed RL optimization. Evaluation is performed using real-world traffic data collected using traffic cameras from an Asian country. Our method can effectively reduce the total delayed time by 60\%. Results demonstrate its superiority when compared to SoTA methods.
- Oceania > Australia (0.04)
- Asia > Taiwan (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)