Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable PSRO-based method for finding approximate Nash equilibria in large zero-sum imperfect-information games. P2SRO is able to parallelize PSRO with convergence guarantees by maintaining a hierarchical pipeline of reinforcement learning workers, each training against the policies generated by lower levels in the hierarchy. We show that unlike existing methods, P2SRO converges to an approximate Nash equilibrium, and does so faster as the number of parallel workers increases, across a variety of imperfect information games. We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of 10^50. P2SRO is able to achieve state-of-the-art performance on Barrage Stratego and beats all existing bots.
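The pipeline idea described in the abstract can be illustrated in a toy normal-form setting. The sketch below is illustrative only and makes several simplifying assumptions: the reinforcement learning training step is replaced by an exact best-response oracle on rock-paper-scissors, the meta-Nash over the fixed policies is approximated with fictitious play, and the pipeline's levels are run one at a time rather than in parallel. All function names here are hypothetical, not from the paper.

```python
import numpy as np

# Row player's payoff for rock-paper-scissors (toy stand-in for a large game).
PAYOFF = np.array([[ 0., -1.,  1.],
                   [ 1.,  0., -1.],
                   [-1.,  1.,  0.]])

def meta_nash(policies, iters=2000):
    """Approximate Nash of the restricted game over the fixed policies,
    via fictitious play (a stand-in for solving the empirical meta-game)."""
    sub = PAYOFF[np.ix_(policies, policies)]
    counts = np.zeros(len(policies))
    counts[0] = 1.0
    for _ in range(iters):
        mix = counts / counts.sum()
        counts[np.argmax(sub @ mix)] += 1.0
    return counts / counts.sum()

def action_mixture(policies, meta):
    """Map a distribution over fixed policies to a distribution over actions."""
    mix = np.zeros(PAYOFF.shape[0])
    for p, w in zip(policies, meta):
        mix[p] += w
    return mix

def best_response(mix):
    """Exact best response -- stand-in for an RL worker training
    against the mixture of lower-level policies."""
    return int(np.argmax(PAYOFF @ mix))

# Sequential version of the pipeline: the lowest active worker trains against
# the meta-strategy over the fixed (lower-level) policies; once done, its
# policy is frozen and a fresh worker starts at the top of the hierarchy.
fixed = [0]                              # begin with an arbitrary pure policy
for _ in range(2):
    mix = action_mixture(fixed, meta_nash(fixed))
    fixed.append(best_response(mix))

final_mix = action_mixture(fixed, meta_nash(fixed))
exploitability = float(np.max(PAYOFF @ final_mix))
print(sorted(set(fixed)), round(exploitability, 3))
```

Starting from rock alone, the restricted game grows to {rock, paper} and then to all three pure strategies, at which point the meta-strategy is close to the uniform Nash equilibrium and the exploitability is small. In the actual P2SRO algorithm, several such workers train concurrently at different pipeline levels, which is where the parallel speedup comes from.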
Review for NeurIPS paper: Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Weaknesses: The paper is missing a comparison with the most relevant previous work, namely XFP [1] (Heinrich and Silver) and Deep CFR. Both of these works are mentioned in the Background and Related Work section, but: 1) XFP is only mentioned and is never compared against in the experiments; 2) Deep CFR is dismissed with "However, Deep CFR uses external sampling, which may be impractical for games with a large branching factor such as Stratego and Barrage Stratego." Furthermore, there are newer variants based on this work, and it is not limited to a particular form of sampling; the paper only really compares against other variants from the PSRO family. Finally, the theory and algorithms (as described in the text) deal only with matrix games, while the experiments are on extensive-form games. If the goal is to run on top of the exponentially large induced matrix game, this should be discussed.