AITopics | efficiency and fairness

DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning

Neural Information Processing Systems | Dec-27-2025, 03:48:10 GMT

Cooperative multi-agent reinforcement learning (MARL) is a challenging task, as agents must learn complex and diverse individual strategies from a shared team reward. However, existing methods struggle to distinguish and exploit important individual experiences, as they lack an effective way to decompose the team reward into individual rewards. To address this challenge, we propose DIFFER, a powerful theoretical framework for decomposing individual rewards to enable fair experience replay in MARL. By enforcing the invariance of network gradients, we establish a partial differential equation whose solution yields the underlying individual reward function. The individual TD-error can then be computed from the solved closed-form individual rewards, indicating the importance of each piece of experience in the learning task and guiding the training process. Our method elegantly achieves an equivalence to the original learning framework when individual experiences are homogeneous, while also adapting to achieve greater efficiency and fairness when diversity is observed. Our extensive experiments on popular benchmarks validate the effectiveness of our theory and method, demonstrating significant improvements in learning efficiency and fairness. Code is available in the supplementary material.
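
As a rough illustration of how an individual TD-error could steer replay, the sketch below computes a one-step TD-error for a single agent and turns a batch of such errors into sampling probabilities. It is not the authors' code: the individual reward is taken as already given (the PDE-based decomposition is not reproduced here), and the function names, `alpha`, and `eps` are hypothetical choices borrowed from generic prioritized replay.

```python
def individual_td_error(reward, q_sa, next_q_values, gamma=0.99, done=False):
    """One-step TD-error for one agent's transition.

    reward        : that agent's individual reward (assumed already decomposed)
    q_sa          : the agent's current estimate Q(s, a)
    next_q_values : the agent's Q-values over actions in the next state
    """
    bootstrap = 0.0 if done else gamma * max(next_q_values)
    return reward + bootstrap - q_sa


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Probabilities proportional to (|td| + eps) ** alpha, normalised to 1."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Hypothetical batch: three individual experiences with different TD-errors.
errors = [0.1, 2.0, 0.1]
probs = sampling_probabilities(errors)
```

In this sketch, experiences with equal TD-errors receive equal probability, so a batch of identical errors is sampled uniformly, while a larger error raises the chance that the experience is replayed.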

artificial intelligence, machine learning, reinforcement learning, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.63)

Balancing Efficiency and Fairness: An Iterative Exchange Framework for Multi-UAV Cooperative Path Planning

Li, Hongzong, Liao, Luwei, Dai, Xiangguang, Feng, Yuming, Feng, Rong, Tang, Shiqin

arXiv.org Artificial Intelligence | Dec-2-2025

Multi-UAV cooperative path planning (MUCPP) is a fundamental problem in multi-agent systems, aiming to generate collision-free trajectories for a team of unmanned aerial vehicles (UAVs) to complete distributed tasks efficiently. A key challenge lies in achieving both efficiency, by minimizing total mission cost, and fairness, by balancing the workload among UAVs to avoid overburdening individual agents. This paper presents a novel Iterative Exchange Framework for MUCPP, balancing efficiency and fairness through iterative task exchanges and path refinements. The proposed framework formulates a composite objective that combines the total mission distance and the makespan, and iteratively improves the solution via local exchanges under feasibility and safety constraints. For each UAV, collision-free trajectories are generated using A* search over a terrain-aware configuration space. Comprehensive experiments on multiple terrain datasets demonstrate that the proposed method consistently achieves superior trade-offs between total distance and makespan compared to existing baselines.
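
The idea of a composite objective improved by local task exchanges can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's framework: the abstract does not give the form of the objective, so a weighted sum of total distance and makespan is assumed; straight-line legs stand in for A* paths over terrain; collision and safety checks are omitted; and all names (`route_length`, `composite_cost`, `improve_by_exchange`, `weight`) are hypothetical.

```python
import math


def route_length(depot, tasks):
    """Length of the path depot -> tasks in order, using straight-line legs."""
    length, here = 0.0, depot
    for task in tasks:
        length += math.dist(here, task)
        here = task
    return length


def composite_cost(depot, routes, weight=0.5):
    """Assumed objective: weight * total distance + (1 - weight) * makespan."""
    lengths = [route_length(depot, r) for r in routes]
    return weight * sum(lengths) + (1 - weight) * max(lengths)


def try_one_move(depot, routes, best, weight):
    """Return (routes, cost) after the first improving single-task move, else None."""
    for i in range(len(routes)):
        for k in range(len(routes[i])):
            for j in range(len(routes)):
                if i == j:
                    continue
                candidate = [list(r) for r in routes]
                candidate[j].append(candidate[i].pop(k))
                cost = composite_cost(depot, candidate, weight)
                if cost < best - 1e-12:
                    return candidate, cost
    return None


def improve_by_exchange(depot, routes, weight=0.5):
    """Apply improving moves until no single-task move lowers the cost."""
    routes = [list(r) for r in routes]
    best = composite_cost(depot, routes, weight)
    while (move := try_one_move(depot, routes, best, weight)) is not None:
        routes, best = move
    return routes, best


# Hypothetical instance: one UAV starts with every task, the other with none.
depot = (0.0, 0.0)
start = [[(1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (-2.0, 0.0)], []]
balanced, cost = improve_by_exchange(depot, start)
```

Because each accepted move strictly lowers the cost, the loop terminates. Raising `weight` favours total distance (efficiency), and lowering it favours the makespan (workload balance).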

artificial intelligence, efficiency and fairness, ieee transaction, (13 more...)

arXiv.org Artificial Intelligence

2512.0041

Country:

Asia > China > Chongqing Province > Chongqing (0.05)
Asia > China > Jiangsu Province > Nanjing (0.05)
Asia > China > Hong Kong > Kowloon (0.04)

Genre: Research Report (0.50)

Industry:

Energy (0.68)
Transportation (0.47)
Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.54)

Balancing Efficiency and Fairness in On-Demand Ridesourcing

Nixie S. Lesmana, Xuan Zhang, Xiaohui Bei

Neural Information Processing Systems | Nov-16-2025, 01:38:58 GMT

We investigate the problem of assigning trip requests to available vehicles in on-demand ridesourcing.

artificial intelligence, assignment, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Passenger (0.96)
Transportation > Infrastructure & Services (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Balancing Efficiency and Fairness in On-Demand Ridesourcing

Nixie S. Lesmana, Xuan Zhang, Xiaohui Bei

Neural Information Processing Systems | Oct-2-2025, 11:41:15 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, assignment, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Passenger (0.96)
Transportation > Infrastructure & Services (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning

Neural Information Processing Systems | Jan-20-2025, 01:53:44 GMT

Cooperative multi-agent reinforcement learning (MARL) is a challenging task, as agents must learn complex and diverse individual strategies from a shared team reward. However, existing methods struggle to distinguish and exploit important individual experiences, as they lack an effective way to decompose the team reward into individual rewards. To address this challenge, we propose DIFFER, a powerful theoretical framework for decomposing individual rewards to enable fair experience replay in MARL. By enforcing the invariance of network gradients, we establish a partial differential equation whose solution yields the underlying individual reward function. The individual TD-error can then be computed from the solved closed-form individual rewards, indicating the importance of each piece of experience in the learning task and guiding the training process. Our method elegantly achieves an equivalence to the original learning framework when individual experiences are homogeneous, while also adapting to achieve greater efficiency and fairness when diversity is observed. Our extensive experiments on popular benchmarks validate the effectiveness of our theory and method, demonstrating significant improvements in learning efficiency and fairness. Code is available in the supplementary material.

artificial intelligence, machine learning, multi-agent reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.99)

Long-term Fairness in Ride-Hailing Platform

Kang, Yufan, Chan, Jeffrey, Shao, Wei, Salim, Flora D., Leckie, Christopher

arXiv.org Artificial Intelligence | Jul-25-2024

Matching in two-sided markets such as ride-hailing has recently received significant attention. However, existing studies on ride-hailing mainly focus on optimising efficiency, and fairness issues in ride-hailing have been neglected. Fairness issues in ride-hailing, including significant earning differences between drivers and variance of passenger waiting times among different locations, have potential economic and ethical impacts. Recent studies that focus on fairness in ride-hailing use traditional optimisation methods and the Markov Decision Process to balance efficiency and fairness. However, these studies have several issues, such as myopic short-term decision-making in traditional optimisation and unstable fairness over a comparatively longer horizon in both traditional optimisation and Markov Decision Process-based methods. To address these issues, we propose a dynamic Markov Decision Process model to alleviate fairness issues currently faced by ride-hailing, and seek a balance between efficiency and fairness, with two distinct characteristics: (i) a prediction module that predicts the number of requests that will be raised in the future from different locations, allowing the proposed method to consider long-term fairness over the whole timeline instead of considering fairness based only on historical and current data patterns; (ii) a customised scalarisation function for multi-objective multi-agent Q-learning that aims to balance efficiency and fairness. Extensive experiments on a publicly available real-world dataset demonstrate that our proposed method outperforms existing state-of-the-art methods.
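
To make the role of a scalarisation function concrete, the sketch below collapses a two-component Q-value (efficiency, fairness) into one number and picks the greedy action. The abstract does not specify the paper's customised function, so plain linear scalarisation is swapped in here as an assumption; `scalarise`, `greedy_action`, the weights, and the toy Q-table are all hypothetical.

```python
def scalarise(q_vector, weights):
    """Linear scalarisation: weighted sum of the per-objective Q-values."""
    return sum(w * q for w, q in zip(weights, q_vector))


def greedy_action(q_table, state, weights):
    """Pick the action whose scalarised Q-value is largest in this state."""
    actions = q_table[state]
    return max(actions, key=lambda a: scalarise(actions[a], weights))


# Hypothetical Q-table: each action maps to (efficiency, fairness) estimates.
q_table = {
    "zone_1": {
        "nearest_driver": (1.0, 0.0),
        "lowest_earner": (0.6, 0.8),
    }
}

efficiency_only = greedy_action(q_table, "zone_1", weights=(1.0, 0.0))
balanced = greedy_action(q_table, "zone_1", weights=(0.5, 0.5))
```

With all weight on efficiency the sketch selects `nearest_driver`; with equal weights it selects `lowest_earner`, which shows how the scalarisation sets the trade-off between the two objectives.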

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2407.17839

Country:

North America > United States > New York (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre:

Research Report > Promising Solution (0.88)
Research Report > New Finding (0.68)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Learning Efficient and Fair Policies for Uncertainty-Aware Collaborative Human-Robot Order Picking

Smit, Igor G., Bukhsh, Zaharah, Pechenizkiy, Mykola, Alogariastos, Kostas, Hendriks, Kasper, Zhang, Yingqian

arXiv.org Artificial Intelligence | Apr-9-2024

In collaborative human-robot order picking systems, human pickers and Autonomous Mobile Robots (AMRs) travel independently through a warehouse and meet at pick locations where pickers load items onto the AMRs. In this paper, we consider an optimization problem in such systems where we allocate pickers to AMRs in a stochastic environment. We propose a novel multi-objective Deep Reinforcement Learning (DRL) approach to learn effective allocation policies to maximize pick efficiency while also aiming to improve workload fairness amongst human pickers. In our approach, we model the warehouse states using a graph, and define a neural network architecture that captures regional information and effectively extracts representations related to efficiency and workload. We develop a discrete-event simulation model, which we use to train and evaluate the proposed DRL approach. In the experiments, we demonstrate that our approach can find non-dominated policy sets that outline good trade-offs between fairness and efficiency objectives. The trained policies outperform the benchmarks in terms of both efficiency and fairness. Moreover, they show good transferability properties when tested on scenarios with different warehouse sizes. The implementation of the simulation model, proposed approach, and experiments are published.
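
A non-dominated policy set can be illustrated with a small filter over (efficiency, fairness) scores. This is a generic sketch of Pareto filtering rather than the paper's implementation; it assumes both scores are to be maximised, and the function names and sample scores are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (both objectives are maximised)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (efficiency, fairness) scores for five trained policies.
scores = [(1, 5), (2, 4), (1, 4), (3, 1), (2, 2)]
front = non_dominated(scores)
```

Here `(1, 4)` and `(2, 2)` are both dominated by `(2, 4)`, so `front` holds the remaining three policies, each representing a different trade-off between the two objectives.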

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2404.08006

Country: Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)
