Balancing Safety and Exploitability in Opponent Modeling

Wang, Zhikun (Max Planck Institute for Intelligent Systems) | Boularias, Abdeslam (Max Planck Institute for Intelligent Systems) | Mülling, Katharina (Max Planck Institute for Intelligent Systems) | Peters, Jan (Max Planck Institute for Intelligent Systems)

Aug-4-2011–AAAI Conferences

Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contains the actual strategy with high probability. The algorithm is safe, as the expected payoff is above the minimax payoff with high probability, and it can exploit the opponents’ preferences once sufficient observations have been obtained. We apply the technique to normal-form games and to stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before the human serves. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.
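
The idea of a high-probability set of opponent strategies with a minimax fallback can be illustrated on rock-paper-scissors. The following is only a minimal sketch under stated assumptions, not the algorithm from the paper: it assumes the opponent plays a fixed mixed strategy, builds the set as an L1 ball around the empirical action frequencies using a standard L1 deviation bound for empirical distributions, and considers only pure responses, falling back to the uniform minimax strategy when no pure response is guaranteed to beat the minimax value. All names (l1_radius, worst_case_payoff, choose_strategy) and the default delta are hypothetical.

```python
import math

# Row player's payoff in rock-paper-scissors: PAYOFF[mine][theirs].
# Action order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
MINIMAX_VALUE = 0.0                      # value of the game
MINIMAX_STRATEGY = [1 / 3, 1 / 3, 1 / 3]  # uniform play secures that value


def l1_radius(n, k, delta):
    """Radius eps such that, with probability at least 1 - delta, the true
    distribution over k >= 2 actions lies within L1 distance eps of the
    empirical one after n samples (assumed bound, not taken from the paper)."""
    if n == 0:
        return 2.0  # the whole simplex
    return min(2.0, math.sqrt(2.0 * math.log((2 ** k - 2) / delta) / n))


def worst_case_payoff(u, p_hat, eps):
    """Minimum of sum(p[j] * u[j]) over distributions p with
    ||p - p_hat||_1 <= eps, where u[j] is my payoff against action j."""
    p = list(p_hat)
    worst = min(range(len(u)), key=lambda j: u[j])
    # At most eps / 2 probability mass can be moved onto my worst outcome.
    budget = min(eps / 2.0, 1.0 - p[worst])
    p[worst] += budget
    # Take that mass from the outcomes that are best for me first.
    for j in sorted(range(len(u)), key=lambda j: u[j], reverse=True):
        if j == worst:
            continue
        take = min(p[j], budget)
        p[j] -= take
        budget -= take
        if budget <= 0:
            break
    return sum(pj * uj for pj, uj in zip(p, u))


def choose_strategy(counts, delta=0.05):
    """Return a mixed strategy given observed opponent action counts."""
    n, k = sum(counts), len(counts)
    if n == 0:
        return list(MINIMAX_STRATEGY)
    p_hat = [c / n for c in counts]
    eps = l1_radius(n, k, delta)
    values = [worst_case_payoff(PAYOFF[a], p_hat, eps) for a in range(k)]
    best = max(range(k), key=lambda a: values[a])
    if values[best] > MINIMAX_VALUE:
        # Exploit: this pure action beats the minimax value against every
        # opponent strategy in the set.
        strategy = [0.0] * k
        strategy[best] = 1.0
        return strategy
    # Stay safe: too few observations to exploit with confidence.
    return list(MINIMAX_STRATEGY)
```

With only a few observations the set is large and the sketch keeps playing uniformly; after many observations of an opponent who strongly favors rock, for example choose_strategy([80, 10, 10]), the set shrinks enough that always playing paper is guaranteed a positive expected payoff against every strategy in it.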

artificial intelligence, machine learning, opponent

Country:
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Industry:
- Leisure & Entertainment
  - Games > Computer Games (0.71)
  - Sports > Tennis (0.69)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Robots (0.97)
  - Representation & Reasoning > Agents (0.69)
