Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping

Liu, Guangyi, Iloglu, Suzan, Caldara, Michael, Durham, Joseph W., Zavlanos, Michael M.

arXiv.org Artificial Intelligence

In Amazon robotic sortation warehouses, mobile robots are deployed to transport and sort packages efficiently to different destinations [1, 2, 3, 4, 5]. The sorting process begins at induction stations, where packages are loaded onto mobile robots and subsequently transported to designated eject chutes based on their destinations (Figure 1). A critical factor determining the package throughput capacity of these facilities is the effective allocation of eject chutes to different destinations. Therefore, the destination-to-chute mapping policy plays a crucial role in optimizing the overall throughput performance of the robotic sortation warehouse. Our previous work [6] addresses the destination assignment problem (DAP) [7] in robotic sorting systems by developing a dynamic chute mapping policy. This policy determines the optimal allocation of eject chutes to destinations with the objective of minimizing the number of unsorted packages. We proposed a model-free reinforcement learning approach that dynamically adjusts the number of chutes assigned to each destination throughout the day. Our solution formulates the chute mapping problem within a Multi-Agent Reinforcement Learning (MARL) framework [8, 9, 10, 11], where each destination is represented as an agent that controls its chute allocation at each time step.
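To make the MARL formulation concrete, below is a minimal toy sketch of such a chute-mapping environment. It is an illustrative assumption, not the paper's actual system: the class name, the arrival model, and the per-chute sorting rate are all invented for this example. Each destination agent's action is the number of chutes it requests for the next time step, joint requests are clipped to the facility's chute budget, and the shared reward is the negative count of unsorted packages.

```python
import random


class ChuteMappingEnv:
    """Toy multi-agent chute-mapping environment (illustrative sketch only).

    Each destination is an agent; its action is the number of chutes it
    requests for the next time step. The joint allocation is scaled down
    so it never exceeds the facility's total chute count, and the shared
    reward penalizes packages left unsorted, mirroring the objective of
    minimizing unsorted packages described above.
    """

    def __init__(self, n_destinations=3, n_chutes=6, chute_rate=5, seed=0):
        self.n_destinations = n_destinations
        self.n_chutes = n_chutes          # total eject chutes in the facility
        self.chute_rate = chute_rate      # packages one chute sorts per step (assumed)
        self.rng = random.Random(seed)
        self.backlog = [0] * n_destinations  # unsorted packages per destination

    def step(self, actions):
        """Apply one joint action (chutes requested per destination)."""
        # Scale the joint request down to the physical chute budget.
        total = sum(actions)
        if total > self.n_chutes:
            actions = [a * self.n_chutes // total for a in actions]
        # Random package arrivals per destination (a stand-in for real demand).
        arrivals = [self.rng.randint(0, 10) for _ in range(self.n_destinations)]
        for i in range(self.n_destinations):
            self.backlog[i] += arrivals[i]
            sorted_now = min(self.backlog[i], actions[i] * self.chute_rate)
            self.backlog[i] -= sorted_now
        # Shared team reward: negative number of unsorted packages.
        reward = -sum(self.backlog)
        return list(self.backlog), reward
```

A learned policy would replace the fixed actions here; the sketch only shows how per-destination agents, a shared chute budget, and an unsorted-package reward fit together in one environment step.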