Balancing Safety and Exploitability in Opponent Modeling
Wang, Zhikun (Max Planck Institute for Intelligent Systems) | Boularias, Abdeslam (Max Planck Institute for Intelligent Systems) | Mülling, Katharina (Max Planck Institute for Intelligent Systems) | Peters, Jan (Max Planck Institute for Intelligent Systems)
Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contains the actual strategy with high probability. The algorithm is safe in the sense that its expected payoff is above the minimax payoff with high probability, and it can exploit the opponents’ preferences once sufficient observations have been obtained. We apply the technique to normal-form games and to stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before the human serves. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.
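The idea of a strategy set that contains the opponent's true strategy with high probability can be illustrated on rock-paper-scissors. The following is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes the set is an L1 ball around the empirical action frequencies, takes the radius from the Weissman et al. concentration bound, and replaces any exact optimization with a coarse grid search over the player's mixed strategies. The names `l1_radius`, `worst_case_payoff` and `safe_strategy` are hypothetical.

```python
import math

# Row player's payoff, PAYOFF[my_action][opponent_action].
# Action order: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def l1_radius(n, k=3, delta=0.05):
    """Radius eps with ||p_hat - p||_1 <= eps with probability >= 1 - delta.

    Assumption: Weissman et al. L1 deviation bound for k outcomes.
    """
    if n == 0:
        return 2.0  # no data: the ball covers the whole simplex
    return min(2.0, math.sqrt(2.0 * math.log((2 ** k - 2) / delta) / n))


def worst_case_payoff(utilities, p_hat, eps):
    """Minimum of sum(u * q) over distributions q with ||q - p_hat||_1 <= eps."""
    q = list(p_hat)
    worst = min(range(len(q)), key=lambda a: utilities[a])
    # Shift up to eps / 2 of probability mass onto the worst opponent action...
    shift = min(eps / 2.0, 1.0 - q[worst])
    q[worst] += shift
    # ...and take it away from the actions that are best for us, in order.
    for a in sorted(range(len(q)), key=lambda a: -utilities[a]):
        if a == worst:
            continue
        take = min(shift, q[a])
        q[a] -= take
        shift -= take
        if shift <= 1e-12:
            break
    return sum(u * x for u, x in zip(utilities, q))


def safe_strategy(counts, delta=0.05, grid=30):
    """Grid search for the mixed strategy with the best worst-case payoff.

    counts[b] is how often the opponent played action b so far.
    The grid is hard-coded for three actions.
    """
    n = sum(counts)
    k = len(counts)
    p_hat = [c / n for c in counts] if n else [1.0 / k] * k
    eps = l1_radius(n, k, delta)
    best_x, best_value = None, -math.inf
    for i in range(grid + 1):
        for j in range(grid + 1 - i):
            x = (i / grid, j / grid, (grid - i - j) / grid)
            # Expected payoff of x against each pure opponent action.
            utilities = [sum(x[a] * PAYOFF[a][b] for a in range(3)) for b in range(3)]
            value = worst_case_payoff(utilities, p_hat, eps)
            if value > best_value:
                best_x, best_value = x, value
    return best_x, best_value


if __name__ == "__main__":
    print(safe_strategy([0, 0, 0]))     # no observations yet
    print(safe_strategy([1000, 0, 0]))  # opponent has always played rock
```

With no observations the ball covers every opponent strategy, so the sketch falls back to the uniform minimax strategy with a guaranteed value of zero. As observations accumulate the radius shrinks and the chosen strategy moves toward a best response to the observed frequencies, which is the safety versus exploitation trade-off the abstract describes.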
Aug-4-2011
- Country:
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Industry:
- Leisure & Entertainment
- Games > Computer Games (0.71)
- Sports > Tennis (0.69)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (1.00)
- Robots (0.97)
- Representation & Reasoning > Agents (0.69)