cbba
Task Allocation for Autonomous Machines using Computational Intelligence and Deep Reinforcement Learning
Nguyen, Thanh Thi, Nguyen, Quoc Viet Hung, Kua, Jonathan, Razzak, Imran, Nguyen, Dung, Nahavandi, Saeid
Enabling multiple autonomous machines to perform reliably requires the development of efficient cooperative control algorithms. This paper presents a survey of algorithms that have been developed for controlling and coordinating autonomous machines in complex environments. We especially focus on task allocation methods using computational intelligence (CI) and deep reinforcement learning (RL). The advantages and disadvantages of the surveyed methods are analysed thoroughly. We also propose and discuss in detail various future research directions that shed light on how to improve existing algorithms or create new methods to enhance the employability and performance of autonomous machines in real-world applications. The findings indicate that CI and deep RL methods provide viable approaches to addressing complex task allocation problems in dynamic and uncertain environments. The recent development of deep RL has greatly contributed to the literature on controlling and coordinating autonomous machines, and it has become a growing trend in this area. It is envisaged that this paper will provide researchers and engineers with a comprehensive overview of progress in machine learning research related to autonomous machines. It also highlights underexplored areas, identifies emerging methodologies, and suggests new avenues for exploration in future research within this domain.
SPACE: A Python-based Simulator for Evaluating Decentralized Multi-Robot Task Allocation Algorithms
Swarm robotics explores the coordination of multiple robots to achieve collective goals, with collective decision-making being a central focus. This process involves decentralized robots autonomously making local decisions and communicating them, which influences the overall emergent behavior. Testing such decentralized algorithms in real-world scenarios with hundreds or more robots is often impractical, underscoring the need for effective simulation tools. We propose SPACE (Swarm Planning and Control Evaluation), a Python-based simulator designed to support the research, evaluation, and comparison of decentralized Multi-Robot Task Allocation (MRTA) algorithms. SPACE streamlines core algorithmic development by allowing users to implement decision-making algorithms as Python plug-ins, easily construct agent behavior trees via an intuitive GUI, and leverage built-in support for inter-agent communication and local task awareness. To demonstrate its practical utility, we implement and evaluate CBBA and GRAPE within the simulator, comparing their performance across different metrics, particularly in scenarios with dynamically introduced tasks. This evaluation shows the usefulness of SPACE in conducting rigorous and standardized comparisons of MRTA algorithms, helping to support future research in the field.
An Auction-based Coordination Strategy for Task-Constrained Multi-Agent Stochastic Planning with Submodular Rewards
Liu, Ruifan, Shin, Hyo-Sang, Yan, Binbin, Tsourdos, Antonios
Abstract--In many domains such as transportation and logistics, search and rescue, or cooperative surveillance, tasks are pending to be allocated with the consideration of possible execution uncertainties. Existing task coordination algorithms either ignore the stochastic process or suffer from the computational intensity. Taking advantage of the'weakly coupled' feature of the problem and the opportunity for coordination in advance, we propose a decentralized auction-based coordination strategy using a newly formulated score function which is generated by forming the problem into task-constrained Markov decision processes (MDPs). The proposed method guarantees convergence and at least 50% optimality in the premise of a submodular reward function. Furthermore, for the implementation on large-scale applications, an approximate variant of the proposed method, namely Deep Auction, is also suggested with the use of neural networks, which is evasive of the troublesome for constructing MDPs. Inspired by the well-known actor-critic architecture, two Transformers are used to map observations to action probabilities and cumulative rewards respectively. Finally, we demonstrate the performance of the two proposed approaches in the context of drone deliveries, where the stochastic planning for the drone league is cast into a stochastic price-collecting Vehicle Routing Problem (VRP) with time windows. Simulation results are compared with state-of-the-art methods in terms of solution quality, planning efficiency and scalability. Cooperative systems of multiple agents, which features a flexible structure, parallel-processing ability, and scalability, are of great interest, especially for those operating on the unmanned aerial platform [1], such as cooperative surveillance, search and rescue [2][3], border patrolling, etc. Among its various instantiations, there is a specific but widespread category of assigning tasks among team members with a global goal, followed by an independent and possibly stochastic task execution. For example, in deliveries of parcels [4], tasks are allocated to individual vehicles, and the vehicles deliver their allocated items in sequence, without interference from other executors. However, the delivery may subject to stochastic travel delays between destinations. Also, in multi-target tracking [5], agents decide the tracking targets and the best action to track based on the estimation of targe manoeuvring.
DrMaMP: Distributed Real-time Multi-agent Mission Planning in Cluttered Environment
Lu, Zehui, Zhou, Tianyu, Mou, Shaoshuai
Solving a collision-aware multi-agent mission planning (task allocation and path finding) problem is challenging due to the requirement of real-time computational performance, scalability, and capability of handling static/dynamic obstacles and tasks in a cluttered environment. This paper proposes a distributed real-time (on the order of millisecond) algorithm DrMaMP, which partitions the entire unassigned task set into subsets via approximation and decomposes the original problem into several single-agent mission planning problems. This paper presents experiments with dynamic obstacles and tasks and conducts optimality and scalability comparisons with an existing method, where DrMaMP outperforms the existing method in both indices. Finally, this paper analyzes the computational burden of DrMaMP which is consistent with the observations from comparisons, and presents the optimality gap in small-size problems.
Multi-Robot Auctions for Allocation of Tasks with Temporal Constraints
Nunes, Ernesto (University of Minnesota) | Gini, Maria (University of Minnesota)
We propose an auction algorithm to allocate tasks that have temporal constraints to cooperative robots. Temporal constraints are expressed as time windows, within which a task must be executed. There are no restrictions on the time windows, which are allowed to overlap. Robots model their temporal constraints using a simple temporal network, enabling them to maintain consistent schedules. When bidding on a task, a robot takes into account its own current commitments and an optimization objective, which is to minimize the time of completion of the last task alone or in combination with minimizing the distance traveled. The algorithm works both when all the tasks are known upfront and when tasks arrive dynamically. We show the performance of the algorithm in simulation with different numbers of tasks and robots, and compare it with a baseline greedy algorithm and a state-of-the-art auction algorithm. Our algorithm is computationally frugal and consistently allocates more tasks than the competing algorithms.
Actor-Critic Policy Learning in Cooperative Planning
Redding, Joshua (Massachusetts Institute of Technology) | Geramifard, Alborz (Massachusetts Institute of Technology) | How, Jonathan (Massachusetts Institute of Technology)
In this paper, we introduce a method for learning and adapting cooperative control strategies in real-time stochastic domains. Our framework is an instance of the intelligent cooperative control architecture (iCCA). The agent starts by following the "safe" plan calculated by the planning module and incrementally adapting the policy to maximize rewards. Actor-critic and consensus-based bundle algorithm (CBBA) were employed as the building blocks of the iCCA framework. We demonstrate the performance of our approach by simulating limited fuel unmanned aerial vehicles aiming for stochastic targets. The integrated framework boosted the optimality of the solution by 10 percent compared to running each of the modules individually.