Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213 Increasingly in domains with multiple intelligent agents, each agent must be able to identify what the other agents are doing. This is especially important when there are adversarial agents interfering with the accomplishment of goals. Once identified, the agents can then respond to recent strategies and adapt to improve performance. This research works under the hypothesis that fast and useful adaptation can be done by analogy to previous observations. We introduce methods to extract similarities in temporal observations of the world.
The Fourteenth Annual AAAI Mobile Robot Competition and Exhibition was held at the National Conference on Artificial Intelligence in Pittsburgh, Pennsylvania, in July 2005. This year marked a change in the venue format from a conference hall to a hotel, which changed how the robot event was run. As a result, the robots were much more visible to the attendees of the AAAI conference than in previous years. This allowed teams that focused on human-robot interaction to have many more opportunities to interact with people. This article describes the events that were held at the conference, including the Scavenger Hunt, Open Interaction, Robot Challenge, and Robot Exhibition.
The 47th annual World Series of Poker starts Tuesday at the Rio All-Suite Hotel and Casino. Among the 69 events is the tournament's Main Event, which begins July 9. It runs through July 18, when a final table of no-limit Texas Hold'em players emerges. The final nine competitors will return to play at the Main Event championship Oct. 30 to Nov. 1. Pennsylvania poker pro Joe McKeehen won the gold bracelet last year, along with a $7.68 million top prize.
Michael Bowling and Manuela Veloso Computer Science Department Carnegie Mellon University Pittsburgh PA, 15213-3891 Abstract Stochastic games are a general model of interaction between multiple agents. They have recently been the focus of a great deal of research in reinforcement learning, as they are both descriptive and have a well-defined Nash equilibrium solution. Most of this recent work, although very general, has only been applied to small games with at most hundreds of states. On the other hand, there are landmark results of learning being successfully applied to specific large and complex games such as Checkers and Backgammon. In this paper we describe a scalable learning algorithm for stochastic games that combines three separate ideas from reinforcement learning into a single algorithm. These ideas are tile coding for generalization, policy gradient ascent as the basic learning method, and our previous work on the WoLF ("Win or Learn Fast") variable learning rate to encourage convergence. We apply this algorithm to the intractably sized game-theoretic card game Goofspiel, showing preliminary results of learning in self-play. We demonstrate that policy gradient ascent can learn even in this highly non-stationary problem with simultaneous learning. We also show that the WoLF principle continues to have a converging effect even in large problems with approximation and generalization. Introduction We are interested in the problem of learning in multiagent environments. One of the main challenges with these environments is that other agents in the environment may be learning and adapting as well.
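The WoLF principle described above can be illustrated with a minimal sketch for a single-state matrix game. This is not the paper's actual algorithm (which combines tile coding with policy gradient ascent over stochastic games); the function names, payoff structure, and learning rates below are illustrative assumptions. The key idea shown is only the variable learning rate: update cautiously when "winning" (the current policy outperforms the historical average policy) and quickly when losing.

```python
def expected_value(policy, payoffs):
    """Expected payoff of a mixed policy against fixed per-action payoffs."""
    return sum(p * r for p, r in zip(policy, payoffs))

def wolf_step(policy, avg_policy, payoffs, best_action,
              lr_win=0.01, lr_lose=0.04):
    """One WoLF-style update: learn slowly when winning, fast when losing.

    'Winning' here means the current policy scores at least as well as the
    average policy against the same payoffs; then the smaller rate is used.
    All names and rates are illustrative, not from the paper.
    """
    winning = (expected_value(policy, payoffs)
               >= expected_value(avg_policy, payoffs))
    lr = lr_win if winning else lr_lose
    # Shift probability mass toward the currently best-scoring action.
    new_policy = [
        min(1.0, p + lr) if a == best_action
        else max(0.0, p - lr / (len(policy) - 1))
        for a, p in enumerate(policy)
    ]
    total = sum(new_policy)
    return [p / total for p in new_policy]  # renormalize to a distribution

# Example: the agent is losing (its policy underperforms its average
# policy), so the larger rate lr_lose drives the bigger step.
updated = wolf_step(policy=[0.5, 0.5], avg_policy=[0.9, 0.1],
                    payoffs=[1.0, 0.0], best_action=0)
```

In the full algorithm the comparison is made per state with learned value estimates rather than known payoffs, but the rate-switching logic has the same shape.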