AITopics | qmix

URB - Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles

Neural Information Processing SystemsJun-18-2026, 16:29:16 GMT

Connected Autonomous Vehicles (CAVs) promise to reduce congestion in future urban networks, potentially by optimizing their routing decisions. Unlike for human drivers, these decisions can be made with collective, data-driven policies, developed using machine learning algorithms. Reinforcement learning (RL) can facilitate the development of such collective routing strategies, yet standardized and realistic benchmarks are missing.

machine learning, reinforcement learning, scenario 1, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Transportation > Infrastructure & Services (0.68)
Transportation > Ground > Road (0.68)
Consumer Products & Services > Travel (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Encouraging metric-aware diversity in contrastive representation space

Neural Information Processing SystemsJun-17-2026, 11:02:41 GMT

In cooperative Multi-Agent Reinforcement Learning (MARL), agents that share policy network parameters often learn similar behaviors, which hinders effective exploration and can lead to suboptimal cooperative policies. Recent advances have attempted to promote multi-agent diversity by leveraging the Wasserstein distance to increase policy differences. However, these methods cannot effectively encourage diverse policies due to ineffective Wasserstein distance caused by the policy similarity. To address this limitation, we propose Wasserstein Contrastive Diversity (WCD) exploration, a novel approach that promotes multi-agent diversity by maximizing the Wasserstein distance between the trajectory distributions of different agents in a latent representation space. To make the Wasserstein distance meaningful, we propose a novel next-step prediction method based on Contrastive Predictive Coding (CPC) to learn distinguishable trajectory representations. Additionally, we introduce an optimized kernel-based method to compute the Wasserstein distance more efficiently. Since the Wasserstein distance is inherently defined for two distributions, we extend it to support multiple agents, enabling diverse policy learning. Empirical evaluations across a variety of challenging multi-agent tasks demonstrate that WCD outperforms existing state-of-the-art methods, delivering superior performance and enhanced exploration.

artificial intelligence, machine learning, wasserstein distance, (16 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.68)
Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Games > Computer Games (0.94)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

0a113ef6b61820daa5611c870ed8d5ee-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 14:57:39 GMT

artificial intelligence, machine learning, qmix, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A unifying view of contrastive learning, importance sampling, and bridge sampling for energy-based models

Martino, Luca

arXiv.org Machine LearningApr-10-2026

In the last decades, energy-based models (EBMs) have become an important class of probabilistic models in which a component of the likelihood is intractable and therefore cannot be evaluated explicitly. Consequently, parameter estimation in EBMs is challenging for conventional inference methods. In this work, we provide a unified framework that connects noise contrastive estimation (NCE), reverse logistic regression (RLR), multiple importance sampling (MIS), and bridge sampling within the context of EBMs. We further show that these methods are equivalent under specific conditions. This unified perspective clarifies relationships among existing methods and enables the development of new estimators, with the potential to improve statistical and computational efficiency. Furthermore, this study helps elucidate the success of NCE in terms of its flexibility and robustness, while also identifying scenarios in which its performance can be further improved. Hence, rather than being a purely descriptive review, this work offers a unifying perspective and additional methodological contributions. The MATLAB code used in the numerical experiments is also made freely available to support the reproducibility of the results.

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

2604.08116

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Italy (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

e9e140df6de01afb672cb859d203c307-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 18:21:26 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(8 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

a7a7c0c92f195cce85f99768621ac6c0-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsFeb-16-2026, 09:29:57 GMT

artificial intelligence, experiment, machine learning, (18 more...)

Neural Information Processing Systems

Industry: Energy > Renewable > Wind (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Add feedback

MAVEN: Multi-Agent Variational Exploration

Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson

Neural Information Processing SystemsFeb-15-2026, 04:47:12 GMT

Wemodel 34], whichisformallyG = hS, U, Pi. S is thestatespacet, every i 2 A {1,..., n} choosesui 2 U which action u 2 U Un. P(s0|s,u): S U S!

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

Super Hard

Neural Information Processing SystemsFeb-15-2026, 04:46:57 GMT

We thank all the reviewers for their feedback. All reviewers are concerned whether we substantially outperform QMIX. Since StarCraft II experiments take a long time, we could not include all the results in the submission. Samvelyan et al. have classified as Easy, Hard & Super Hard. Results on several maps are shown below.

artificial intelligence, exploration, representational constraint, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.30)

Add feedback