AITopics

Country: North America > Canada > Alberta (0.14)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

arXiv.org Machine Learning, Dec-20-2011

Alignment Based Kernel Learning with a Continuous Set of Base Kernels

Afkanpour, Arash, Szepesvari, Csaba, Bowling, Michael

The success of kernel-based learning methods depends on the choice of kernel. Recently, kernel learning methods have been proposed that use data to select the most appropriate kernel, usually by combining a set of base kernels. We introduce a new algorithm for kernel learning that combines a continuous set of base kernels, without the common step of discretizing the space of base kernels. We demonstrate that our new method achieves state-of-the-art performance across a variety of real-world datasets. Furthermore, we explicitly demonstrate the importance of combining the right dictionary of kernels, which is problematic for methods based on a finite set of base kernels chosen a priori. Our method is not the first approach to work with continuously parameterized kernels. However, we show that our method requires substantially less computation than previous such approaches, and so is more amenable to multi-dimensional parameterizations of base kernels, which we demonstrate.
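As a rough illustration of what "alignment" can mean here, the sketch below computes one common definition, centered kernel alignment between a Gram matrix and the label kernel yy^T, for Gaussian kernels indexed by a continuous bandwidth. The abstract does not say which alignment measure or kernel family the authors use, so the definition, the Gaussian family, and every function name are assumptions made for illustration. This is not the paper's algorithm.

```python
import math

def gaussian_gram(xs, bandwidth):
    """Gram matrix of a 1-D Gaussian kernel (assumed kernel family)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]

def center(matrix):
    """Double-center a square matrix: subtract row and column means, add the grand mean."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def frobenius(a, b):
    """Frobenius inner product of two equally sized matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def alignment(gram, labels):
    """Centered alignment between a Gram matrix and the label kernel y y^T."""
    target = [[yi * yj for yj in labels] for yi in labels]
    kc, tc = center(gram), center(target)
    return frobenius(kc, tc) / math.sqrt(frobenius(kc, kc) * frobenius(tc, tc))

# Hypothetical toy data: two well-separated clusters with opposite labels.
XS = [0.0, 0.1, 0.2, 2.0, 2.1, 2.2]
LABELS = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    for bandwidth in (0.01, 0.5):
        print(bandwidth, alignment(gaussian_gram(XS, bandwidth), LABELS))
```

On this toy data a bandwidth near the cluster width scores higher than a very narrow one, which is the kind of signal a kernel learner could follow when the bandwidth is treated as a continuous parameter.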

artificial intelligence, kernel, optimization problem, (19 more...)

arXiv.org Machine Learning

1112.4607

Country: North America > Canada > Alberta (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Dec-31-2009

Monte Carlo Sampling for Regret Minimization in Extensive Games

Lanctot, Marc, Waugh, Kevin, Zinkevich, Martin, Bowling, Michael

Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent, sample-based CFR algorithms called Monte Carlo counterfactual regret minimization (MCCFR), of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR in expectation. Then, we introduce two sampling schemes, outcome sampling and external sampling, showing that both have bounded overall regret with high probability. Thus, they can compute an approximate equilibrium using self-play. Finally, we prove a new tighter bound on the regret for the original CFR algorithm and relate this new bound to MCCFR's bounds. We show empirically that, although the sample-based algorithms require more iterations, their lower cost per iteration can lead to dramatically faster convergence in various games.
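The claim that sampled updates match the full update in expectation rests on importance weighting. The toy sketch below shows only that idea: a single terminal outcome is sampled and its contribution is divided by its sampling probability, so the estimate averages to the exact value. The outcome table, its field names, and the helper functions are hypothetical, and this is not an implementation of MCCFR.

```python
import random
from fractions import Fraction

# Hypothetical terminal outcomes: probability of being reached under the
# current strategies, utility at that outcome, and the probability with
# which the sampler picks it (must be positive wherever reach is positive).
OUTCOMES = [
    {"reach": Fraction(1, 2), "utility": 4, "sample_prob": Fraction(1, 4)},
    {"reach": Fraction(1, 3), "utility": -3, "sample_prob": Fraction(1, 4)},
    {"reach": Fraction(1, 6), "utility": 6, "sample_prob": Fraction(1, 2)},
]

def exact_value(outcomes):
    """Full (unsampled) expected utility: sum over all terminal outcomes."""
    return sum(o["reach"] * o["utility"] for o in outcomes)

def sampled_estimate(outcome):
    """Importance-weighted estimate built from one sampled outcome."""
    return outcome["reach"] * outcome["utility"] / outcome["sample_prob"]

def expected_estimate(outcomes):
    """Expectation of the sampled estimate under the sampling distribution."""
    return sum(o["sample_prob"] * sampled_estimate(o) for o in outcomes)

def monte_carlo_average(outcomes, num_samples, seed=0):
    """Average of many single-outcome estimates drawn with a seeded generator."""
    rng = random.Random(seed)
    weights = [float(o["sample_prob"]) for o in outcomes]
    draws = rng.choices(outcomes, weights=weights, k=num_samples)
    return float(sum(sampled_estimate(o) for o in draws)) / num_samples

if __name__ == "__main__":
    print(exact_value(OUTCOMES), expected_estimate(OUTCOMES))
    print(monte_carlo_average(OUTCOMES, 20000))
```

Each sample touches one outcome instead of all of them, which mirrors the trade-off the abstract describes: cheaper iterations at the price of noisier estimates.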

artificial intelligence, game theory, iteration, (19 more...)

Country:

North America > Canada > Alberta (0.29)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Dec-31-2009

Strategy Grafting in Extensive Games

Waugh, Kevin, Bard, Nolan, Bowling, Michael

Extensive games are often used to model the interactions of multiple agents within an environment. Much recent work has focused on increasing the size of an extensive game that can be feasibly solved. Despite these improvements, many interesting games are still too large for such techniques. A common approach for computing strategies in these large games is to first employ an abstraction technique to reduce the original game to an abstract game that is of a manageable size. This abstract game is then solved and the resulting strategy is used in the original game. Most top programs in recent AAAI Computer Poker Competitions use this approach. The trend in this competition has been that strategies found in larger abstract games tend to beat strategies found in smaller abstract games. These larger abstract games have more expressive strategy spaces and therefore contain better strategies. In this paper we present a new method for computing strategies in large games. This method allows us to compute more expressive strategies without increasing the size of abstract games that we are required to solve. We demonstrate the power of the approach experimentally in both small and large games, while also providing a theoretical justification for the resulting improvement.

artificial intelligence, base strategy, game theory, (18 more...)

Country:

North America > Canada > Alberta (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games > Poker (0.51)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Games > Poker (0.72)

Stable Dual Dynamic Programming

Wang, Tao, Bowling, Michael, Schuurmans, Dale, Lizotte, Daniel J.

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.

artificial intelligence, operator, reinforcement learning, (18 more...)

Country: North America > Canada > Alberta (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Stable Dual Dynamic Programming

Wang, Tao, Bowling, Michael, Schuurmans, Dale, Lizotte, Daniel J.

Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions instead of value functions. In this paper, we investigate the convergence properties of these dual algorithms both theoretically and empirically, and show how they can be scaled up by incorporating function approximation.

artificial intelligence, operator, optimization problem, (19 more...)

Country: North America > Canada > Alberta (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.73)

Computing Robust Counter-Strategies

Johanson, Michael, Zinkevich, Martin, Bowling, Michael

Adaptation to other initially unknown agents often requires computing an effective counter-strategy. In the Bayesian paradigm, one must find a good counter-strategy to the inferred posterior of the other agents' behavior. In the experts paradigm, one may want to choose experts that are good counter-strategies to the other agents' expected behavior. In this paper we introduce a technique for computing robust counter-strategies for adaptation in multiagent scenarios under a variety of paradigms. The strategies can take advantage of a suspected tendency in the decisions of the other agents, while bounding the worst-case performance when the tendency is not observed. The technique involves solving a modified game, and therefore can make use of recently developed algorithms for solving very large extensive games. We demonstrate the effectiveness of the technique in two-player Texas Hold'em. We show that the computed poker strategies are substantially more robust than best response counter-strategies, while still exploiting a suspected tendency. We also compose the generated strategies in an experts algorithm showing a dramatic improvement in performance over using simple best responses.

artificial intelligence, game theory, opponent, (19 more...)

Country:

North America > Canada > Alberta (0.29)
North America > United States > Texas (0.26)

Industry: Leisure & Entertainment > Games > Poker (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Games > Poker (0.70)

Regret Minimization in Games with Incomplete Information

Zinkevich, Martin, Johanson, Michael, Bowling, Michael, Piccione, Carmelo

Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for solving large games based on regret minimization. In particular, we introduce the notion of counterfactual regret, which exploits the degree of incomplete information in an extensive game. We show how minimizing counterfactual regret minimizes overall regret, and therefore in self-play can be used to compute a Nash equilibrium. We demonstrate this technique in the domain of poker, showing we can solve abstractions of limit Texas Hold'em with as many as 10
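To make the link between regret minimization in self-play and equilibrium computation concrete, here is a minimal sketch of ordinary regret matching in rock-paper-scissors. It is a deliberately simplified stand-in: it works on a one-shot matrix game with full expected-value updates and does not implement counterfactual regret on an extensive game. The payoff table, function names, and the non-uniform starting regrets are all assumptions chosen for illustration.

```python
# Payoff to player 0 in rock-paper-scissors; player 1 receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock against rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]

def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]

def action_values(opponent, player):
    """Expected utility of each pure action against the opponent's mixed strategy."""
    n = len(PAYOFF)
    if player == 0:
        return [sum(PAYOFF[a][b] * opponent[b] for b in range(n)) for a in range(n)]
    return [sum(-PAYOFF[a][b] * opponent[a] for a in range(n)) for b in range(n)]

def self_play(iterations, initial_regrets=((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))):
    """Run simultaneous regret matching and return both average strategies."""
    regrets = [list(initial_regrets[0]), list(initial_regrets[1])]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for p in (0, 1):
            values = action_values(strategies[1 - p], p)
            expected = sum(s * v for s, v in zip(strategies[p], values))
            for a in range(3):
                regrets[p][a] += values[a] - expected
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]

def exploitability(average):
    """Best-response value each player can obtain against the other's average strategy."""
    return [max(action_values(average[1], 0)), max(action_values(average[0], 1))]

if __name__ == "__main__":
    average = self_play(20000)
    print(average, exploitability(average))
```

The current strategies may keep cycling, but the average strategies become hard to exploit as the iteration count grows. In a zero-sum game that is what approximating a Nash equilibrium means.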

artificial intelligence, game theory, information, (20 more...)

Country:

North America > Canada > Alberta (0.29)
North America > United States > Texas (0.25)

Industry: Leisure & Entertainment > Games > Poker (0.91)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Games > Poker (0.70)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Neural Information Processing Systems, Dec-31-2007

iLSTD: Eligibility Traces and Convergence Analysis

Geramifard, Alborz, Bowling, Michael, Zinkevich, Martin, Sutton, Richard S.

In this paper, we generalize the previous iLSTD algorithm and present three new results: (1) the first convergence proof for an iLSTD algorithm; (2) an extension to incorporate eligibility traces without changing the asymptotic computational complexity; and (3) the first empirical results with an iLSTD algorithm for a problem (mountain car) with feature vectors large enough (n = 10,000) to show substantial computational advantages over LSTD.

artificial intelligence, ilstd, reinforcement learning, (16 more...)

Country:

North America > Canada > Alberta (0.29)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems, Dec-31-2006

Online Discovery and Learning of Predictive State Representations

McCracken, Peter, Bowling, Michael

Predictive state representations (PSRs) are a method of modeling dynamical systems using only observable data, such as actions and observations, to describe their model. PSRs use predictions about the outcome of future tests to summarize the system state.

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > Canada > Alberta (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)