Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping

arXiv.org Artificial Intelligence

In Amazon robotic sortation warehouses, mobile robots transport and sort packages to their destinations [1, 2, 3, 4, 5]. Sorting begins at induction stations, where packages are loaded onto mobile robots, which then carry them to the eject chutes designated for their destinations (Figure 1). How eject chutes are allocated to destinations is a critical factor in the package throughput capacity of these facilities, so the destination-to-chute mapping policy plays a crucial role in optimizing the overall throughput of the warehouse.

Our previous work [6] addressed the destination assignment problem (DAP) [7] in robotic sorting systems by developing a dynamic chute mapping policy. This policy determines the allocation of eject chutes to destinations with the objective of minimizing the number of unsorted packages. We proposed a model-free reinforcement learning approach that dynamically adjusts the number of chutes assigned to each destination throughout the day. The solution formulates the chute mapping problem within a Multi-Agent Reinforcement Learning (MARL) framework [8, 9, 10, 11], in which each destination is represented as an agent that controls its own chute allocation at each time step.
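
To make the formulation concrete, the following is a minimal toy sketch of the agent-per-destination idea, not the environment used in [6]. Everything in it is an assumption made for illustration: the class name `ToyChuteEnv`, the per-chute capacity, the actions expressed as changes in chute count, and the shared reward defined as the negative number of unsorted packages.

```python
from dataclasses import dataclass, field


@dataclass
class ToyChuteEnv:
    """Hypothetical toy model: one agent per destination, one shared pool of chutes."""
    total_chutes: int        # size of the shared chute pool (assumed)
    chute_capacity: int      # packages one chute can sort per time step (assumed)
    allocation: dict         # destination -> number of chutes currently assigned
    backlog: dict = field(default_factory=dict)  # destination -> unsorted packages

    def step(self, actions, arrivals):
        """Apply each agent's chute change, sort packages, return the shared reward."""
        # Process releases (negative actions) first so that freed chutes
        # can be claimed by other destinations within the same time step.
        for dest in sorted(actions, key=lambda d: actions[d]):
            delta = actions[dest]
            free = self.total_chutes - sum(self.allocation.values())
            if delta > 0:
                delta = min(delta, free)  # cannot exceed the shared pool
            self.allocation[dest] = max(0, self.allocation.get(dest, 0) + delta)

        unsorted_total = 0
        for dest, chutes in self.allocation.items():
            pending = self.backlog.get(dest, 0) + arrivals.get(dest, 0)
            sorted_now = min(pending, chutes * self.chute_capacity)
            self.backlog[dest] = pending - sorted_now
            unsorted_total += self.backlog[dest]

        # Shared reward: fewer unsorted packages is better.
        return -unsorted_total


env = ToyChuteEnv(total_chutes=4, chute_capacity=10, allocation={"A": 2, "B": 2})
# Destination B releases a chute, destination A claims it.
reward = env.step(actions={"A": 1, "B": -1}, arrivals={"A": 35, "B": 5})
print(reward, env.allocation)
```

In this toy run, shifting one chute from the lightly loaded destination to the heavily loaded one leaves fewer packages unsorted than keeping the allocation fixed, which is the kind of adjustment a learned policy would be expected to make over the course of a day.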