On the hardness of RL with Lookahead

Corentin Pla, Hugo Richard, Marc Abeille, Nadav Merlis, Vianney Perchet

Oct-23-2025–arXiv.org Machine Learning

We study reinforcement learning (RL) with transition look-ahead, where the agent may observe which states would be visited upon playing any sequence of $\ell$ actions before deciding its course of action. While such predictive information can drastically improve the achievable performance, we show that using this information optimally comes at a potentially prohibitive computational cost. Specifically, we prove that optimal planning with one-step look-ahead ($\ell=1$) can be solved in polynomial time through a novel linear programming formulation. In contrast, for $\ell \geq 2$, the problem becomes NP-hard. Our results delineate a precise boundary between tractable and intractable cases for the problem of planning with transition look-ahead in reinforcement learning.
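To make the $\ell=1$ setting concrete, the following is a minimal brute-force sketch, not the linear programming formulation mentioned in the abstract. It assumes a finite-horizon tabular MDP in which the next state of each action is drawn independently; the agent sees one sampled next state per action and then picks the best. The names `lookahead_values`, `standard_values`, `P`, and `R` are hypothetical and chosen only for illustration. The enumeration over joint outcomes grows exponentially with the number of actions, so this is only meant for toy instances.

```python
from itertools import product


def standard_values(P, R, horizon):
    """Finite-horizon values without look-ahead.

    P[s][a] is a dict {next_state: probability}; R[s][a] is a reward.
    """
    V = [0.0] * len(P)  # value once no steps remain
    for _ in range(horizon):
        V = [
            max(
                R[s][a] + sum(p * V[t] for t, p in P[s][a].items())
                for a in range(len(P[s]))
            )
            for s in range(len(P))
        ]
    return V


def lookahead_values(P, R, horizon):
    """Finite-horizon values with one-step transition look-ahead.

    Assumption (not from the source): next states are sampled
    independently across actions, and the agent observes all of them
    before choosing an action.
    """
    V = [0.0] * len(P)
    for _ in range(horizon):
        new_V = []
        for s in range(len(P)):
            # One list of (next_state, probability) pairs per action.
            outcomes = [list(P[s][a].items()) for a in range(len(P[s]))]
            total = 0.0
            # Enumerate every joint realization: one next state per action.
            for joint in product(*outcomes):
                prob = 1.0
                for _, p in joint:
                    prob *= p
                # With the realized next states known, act greedily.
                best = max(
                    R[s][a] + V[s_next]
                    for a, (s_next, _) in enumerate(joint)
                )
                total += prob * best
            new_V.append(total)
        V = new_V
    return V


# Toy instance: state 0 has two zero-reward actions that each reach the
# rewarding absorbing state 1 with probability 0.5.
P = [[{0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}], [{1: 1.0}]]
R = [[0.0, 0.0], [1.0]]

print(standard_values(P, R, horizon=2))
print(lookahead_values(P, R, horizon=2))
```

In this toy instance, look-ahead helps in state 0 because the agent can pick whichever action is observed to reach state 1, which happens unless both sampled next states are state 0.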

machine learning, reinforcement learning, transition look-ahead

Country:
- North America > United States
  - New York (0.04)
  - New Jersey > Hudson County
    - Hoboken (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Europe > France
  - Île-de-France > Paris > Paris (0.04)
- Asia > Middle East
  - Jordan (0.04)
  - Israel > Haifa District
    - Haifa (0.04)

Genre:
- Research Report > New Finding (0.66)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Mathematical & Statistical Methods (0.68)
    - Agents (0.67)
    - Optimization (0.66)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.68)
