Iteratively Learn Diverse Strategies with State Distance Information

Dec-24-2025, 23:43:07 GMT–Neural Information Processing Systems

In complex reinforcement learning (RL) problems, policies with similar rewards may behave in substantially different ways. Optimizing rewards while also discovering as many distinct strategies as possible remains a fundamental challenge, and it can be crucial in many practical applications. Our study examines two design choices for tackling this challenge. First, we find that with existing diversity measures, visually indistinguishable policies can still yield high diversity scores. To capture behavioral differences accurately, we propose incorporating state-space distance information into the diversity measure.
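To make the contrast concrete, here is a minimal toy sketch, not the paper's actual diversity measure. It assumes a hypothetical 2D grid world with two discrete actions and compares an action-based score (the fraction of steps on which two rollouts pick different actions) with a state-based score (the mean Euclidean distance between time-aligned states). All names (`rollout`, `action_disagreement`, `mean_state_distance`, `MOVES`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy dynamics: action 0 moves right, action 1 moves up.
MOVES = {0: (1.0, 0.0), 1: (0.0, 1.0)}


def rollout(actions, start=(0.0, 0.0)):
    """Return the sequence of visited states, shape (T, 2), for an action list."""
    pos = np.array(start, dtype=float)
    states = []
    for a in actions:
        pos = pos + np.array(MOVES[a])
        states.append(pos.copy())
    return np.stack(states)


def action_disagreement(actions_a, actions_b):
    """Action-based score: fraction of steps where the chosen actions differ."""
    a = np.asarray(actions_a)
    b = np.asarray(actions_b)
    return float(np.mean(a != b))


def mean_state_distance(states_a, states_b):
    """State-based score: mean Euclidean distance between time-aligned states."""
    a = np.asarray(states_a, dtype=float)
    b = np.asarray(states_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("rollouts must have the same shape (T, state_dim)")
    return float(np.linalg.norm(a - b, axis=1).mean())


# Two zigzag rollouts that trace nearly the same diagonal, and one straight line.
zigzag_a = [0, 1, 0, 1]
zigzag_b = [1, 0, 1, 0]
straight = [0, 0, 0, 0]

for name, other in [("zigzag_b", zigzag_b), ("straight", straight)]:
    act = action_disagreement(zigzag_a, other)
    dist = mean_state_distance(rollout(zigzag_a), rollout(other))
    print(f"zigzag_a vs {name}: action score={act:.2f}, state score={dist:.3f}")
```

In this toy setting the two zigzag rollouts disagree on every action yet stay closer together in state space than the zigzag and the straight-line rollout, so the action-based score ranks the pairs in the opposite order from the state-based one. This is only meant to illustrate why a measure that ignores state-space distance can rate near-identical behaviors as highly diverse.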
