What makes math problems hard for reinforcement learning: a case study

Shehper, Ali; Medina-Mardones, Anibal M.; Lewandowski, Bartłomiej; Gruen, Angus; Kucharski, Piotr; Gukov, Sergei

Aug-27-2024, arXiv.org Artificial Intelligence

Using a long-standing conjecture from combinatorial group theory, we explore, from multiple angles, the challenges of finding rare instances carrying disproportionately high rewards. Based on lessons learned in the mathematical context defined by the Andrews-Curtis conjecture, we propose algorithmic improvements that can be relevant in other domains with ultra-sparse reward problems. Although our case study can be formulated as a game, its shortest winning sequences are potentially $10^6$ or $10^9$ times longer than those encountered in chess. In the process of our study, we demonstrate that one of the potential counterexamples due to Akbulut and Kirby, whose status escaped direct mathematical methods for 39 years, is stably AC-trivial.

algorithm, presentation, sequence, (16 more...)

Country:
- North America
  - Canada (0.04)
  - United States
    - New Jersey > Middlesex County
      - Piscataway (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
    - California > Los Angeles County
      - Pasadena (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Poland > Masovia Province
    - Warsaw (0.04)
- Asia > South Korea
  - Seoul > Seoul (0.04)

Genre:
- Research Report (1.00)

Industry:
- Leisure & Entertainment > Games
  - Chess (0.48)
- Government > Regional Government
  - North America Government > United States Government (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Representation & Reasoning > Search (0.70)
  - Machine Learning
    - Reinforcement Learning (0.83)
    - Neural Networks > Deep Learning (0.45)