pgx
Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning
We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging JAX's auto-vectorization and parallelization over accelerators, Pgx can efficiently scale to thousands of simultaneous simulations over accelerators. In our experiments on a DGX-A100 workstation, we discovered that Pgx can simulate RL environments 10-100x faster than existing implementations available in Python. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles. We demonstrate the efficient training of the Gumbel AlphaZero algorithm with Pgx environments. Overall, Pgx provides high-performance environment simulators for researchers to accelerate their RL experiments. Pgx is available at https://github.com/sotetsuk/pgx.
- Europe > Italy (0.04)
- North America > United States > Texas (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Overview (0.46)
- Research Report > New Finding (0.46)
- Leisure & Entertainment > Games > Chess (0.50)
- Leisure & Entertainment > Games > Backgammon (0.47)
- Leisure & Entertainment > Games > Go (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Games (1.00)
- Europe > Italy (0.04)
- North America > United States > Texas (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Overview (0.46)
- Research Report > New Finding (0.46)
- Leisure & Entertainment > Games > Chess (0.50)
- Leisure & Entertainment > Games > Backgammon (0.47)
- Leisure & Entertainment > Games > Go (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning
We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging JAX's auto-vectorization and parallelization over accelerators, Pgx can efficiently scale to thousands of simultaneous simulations over accelerators. In our experiments on a DGX-A100 workstation, we discovered that Pgx can simulate RL environments 10-100x faster than existing implementations available in Python. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles.
- Leisure & Entertainment > Games > Chess (0.30)
- Leisure & Entertainment > Games > Backgammon (0.30)
Preference-Conditioned Gradient Variations for Multi-Objective Quality-Diversity
Janmohamed, Hannah, Faldor, Maxence, Pierrot, Thomas, Cully, Antoine
In a variety of domains, from robotics to finance, Quality-Diversity algorithms have been used to generate collections of both diverse and high-performing solutions. Multi-Objective Quality-Diversity algorithms have emerged as a promising approach for applying these methods to complex, multi-objective problems. However, existing methods are limited by their search capabilities. For example, Multi-Objective Map-Elites depends on random genetic variations which struggle in high-dimensional search spaces. Despite efforts to enhance search efficiency with gradient-based mutation operators, existing approaches consider updating solutions to improve on each objective separately rather than achieving desired trade-offs. In this work, we address this limitation by introducing Multi-Objective Map-Elites with Preference-Conditioned Policy-Gradient and Crowding Mechanisms: a new Multi-Objective Quality-Diversity algorithm that uses preference-conditioned policy-gradient mutations to efficiently discover promising regions of the objective space and crowding mechanisms to promote a uniform distribution of solutions on the Pareto front. We evaluate our approach on six robotics locomotion tasks and show that our method outperforms or matches all state-of-the-art Multi-Objective Quality-Diversity methods in all six, including two newly proposed tri-objective tasks. Importantly, our method also achieves a smoother set of trade-offs, as measured by newly-proposed sparsity-based metrics. This performance comes at a lower computational storage cost compared to previous methods.
- Research Report > Promising Solution (0.66)
- Research Report > Experimental Study (0.46)
- Leisure & Entertainment > Games (0.46)
- Energy > Oil & Gas (0.46)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning
Koyamada, Sotetsu, Okano, Shinri, Nishimori, Soichiro, Murata, Yu, Habara, Keigo, Kita, Haruka, Ishii, Shin
We propose Pgx, a suite of board game reinforcement learning (RL) environments written in JAX and optimized for GPU/TPU accelerators. By leveraging JAX's auto-vectorization and parallelization over accelerators, Pgx can efficiently scale to thousands of simultaneous simulations over accelerators. In our experiments on a DGX-A100 workstation, we discovered that Pgx can simulate RL environments 10-100x faster than existing implementations available in Python. Pgx includes RL environments commonly used as benchmarks in RL research, such as backgammon, chess, shogi, and Go. Additionally, Pgx offers miniature game sets and baseline models to facilitate rapid research cycles. We demonstrate the efficient training of the Gumbel AlphaZero algorithm with Pgx environments. Overall, Pgx provides high-performance environment simulators for researchers to accelerate their RL experiments. Pgx is available at https://github.com/sotetsuk/pgx.
- Europe > Italy (0.04)
- North America > United States > Texas (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Leisure & Entertainment > Games > Backgammon (0.67)
- Leisure & Entertainment > Games > Chess (0.52)
- Leisure & Entertainment > Games > Go (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Games (1.00)