Robust Reinforcement Learning Under Minimax Regret for Green Security

Lily Xu, Andrew Perrault, Fei Fang, Haipeng Chen, Milind Tambe

Jun-15-2021–arXiv.org Artificial Intelligence

Green security domains feature defenders who plan patrols in the face of uncertainty about the adversarial behavior of poachers, illegal loggers, and illegal fishers. Importantly, the deterrence effect of patrols on adversaries' future behavior makes patrol planning a sequential decision-making problem. We therefore focus on robust sequential patrol planning for green security under the minimax regret criterion, which has not previously been considered in the literature. We formulate the problem as a game between the defender and nature, which controls the parameter values of the adversarial behavior, and design an algorithm, MIRROR, to find a robust policy. MIRROR uses two reinforcement learning-based oracles and solves a restricted game that considers a limited set of defender strategies and parameter values. We evaluate MIRROR on real-world poaching data.
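To make the minimax regret criterion concrete, the sketch below picks a policy from a small, finite table of returns. It is not MIRROR itself: there are no reinforcement learning oracles and no mixed strategies, only the pure-strategy criterion applied to a restricted set of defender policies and parameter values. The function name `minimax_regret_choice` and the numbers in `returns` are hypothetical and chosen purely for illustration.

```python
import numpy as np


def minimax_regret_choice(returns):
    """Pick the policy whose worst-case regret is smallest.

    returns[i, j] is the defender's return when playing policy i while
    nature selects parameter setting j (hypothetical values, not paper data).
    """
    returns = np.asarray(returns, dtype=float)
    # Best achievable return for each parameter setting (column-wise max).
    best_per_param = returns.max(axis=0)
    # Regret: how much each policy loses relative to the best policy
    # for that same parameter setting.
    regret = best_per_param - returns
    # Nature picks the parameter setting that maximizes the defender's regret.
    max_regret = regret.max(axis=1)
    # The defender picks the policy that minimizes that worst-case regret.
    best_policy = int(np.argmin(max_regret))
    return best_policy, float(max_regret[best_policy]), regret


# Three candidate patrol policies (rows) and two adversary parameter
# settings (columns); all values are made up for this example.
returns = [
    [10.0, 2.0],
    [7.0, 6.0],
    [4.0, 8.0],
]
policy, worst_regret, regret = minimax_regret_choice(returns)
print(policy, worst_regret)
```

In this toy table the second policy is never the best for either parameter setting, yet it is selected because it is never far from the best. The restricted game described in the abstract plays a similar role, but with its strategy and parameter sets supplied by the two oracles rather than fixed in advance.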

max-regret game, oracle, proc

Country:
- Asia > Cambodia (0.14)
- Africa > Uganda (0.04)
- North America > United States
  - California (0.14)

Genre:
- Research Report (0.82)

Industry:
- Leisure & Entertainment > Games (1.00)

Technology:
- Information Technology
  - Game Theory (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning > Reinforcement Learning (1.00)
