RoShamBo, or Rock-Paper-Scissors, is a classic children's game and a common example used in game theory. Two players simultaneously choose and reveal one of three actions. Rock beats scissors, scissors beats paper, paper beats rock, and the two players tie if they choose the same action.
While the game can be played a single time, it is much more interesting if a long series of repeated games is played, or if it is played in a round-robin tournament with weak opponents. This is because it is possible to play the game badly: if a player only picks one action, such as Always-Rock, then their opponent can learn about that tendency and change their strategy to Always-Paper in order to beat them. On the other hand, it is easy for people to discover the optimal strategy (a unique Nash equilibrium) by trying to pick uniformly at randomly between the actions: 1/3 rock, 1/3 paper, 1/3 scissors. If they can randomly pick between the actions (which is difficult, as people are typically bad at choosing randomly) then their opponent cannot do anything to win more than 1/3 or lose less than 1/3 of the games. The player picking uniformly at random will do no better and no worse than tie against any opponent, regardless of if the opponent is playing predictably and exploitably (such as Always-Rock) or unexploitably (Uniform Random). This can make the game too simple, as it is too easy to not lose.
In computer RoShamBo tournaments, such as Darse Billings' famous RoShamBo Programming Competition, the competitors play a round-robin tournament where every pair of programs play a long series of games. However, to make the tournament interesting, the competitors also play against a set of unknown weak opponents who have some predictable tendency, and the competitors do not know during a match if they are playing against another competitor or a weak opponent. This means that the Uniform Random strategy described above is unlikely to win, since on average it will tie against every opponent. On the other hand, any competitor that can observe and discover their opponent's weakness (if any) can change their strategy to exploit them and do much better than tie, and place higher in the overall tournament rankings. The presence of the weak opponents create a rich ecosystem where the competitors will attempt to model and bluff and outsmart each other, instead of simply playing the Uniform Random strategy. This approach to tournament design has since been applied to other domains, such as the Annual Computer Poker Competition, to encourage more research on opponent modelling and adaptation instead of using unexploitable Nash equilibrium strategies that do not try to maximize their winnings.
Challenges for Artificial Intelligence Research:
The challenge of RoShamBo tournaments is one of online opponent modelling and adaptation. As a match is ongoing, an agent needs to examine its opponent's past behavior, try to extrapolate what it will do on the next decision, and choose what to do in response in order to maximize its long term winnings. This may not be simple -- a supposedly weak opponent may be trying to mislead the agent, so that it can itself be exploited. This domain emphasizes the importance of algorithms that can model the opponent quickly (before the match is over) and accurately (so that we can respond appropriately), and also incorporate game theoretic reasoning so that we can be robust in cases where our opponent was trying to mislead us. The simplicity of the domain makes it easy to write and test computer agents, making it a useful domain for students and classes.