SPOT: Scalable Policy Optimization with Trees for Markov Decision Processes

Jun-21-2026, 13:07:41 GMT–Neural Information Processing Systems

Interpretable reinforcement learning policies are essential for high-stakes decision-making, yet optimizing decision tree policies in Markov Decision Processes (MDPs) remains challenging. We propose SPOT, a novel method for computing decision tree policies, which formulates the optimization problem as a mixed-integer linear program (MILP). To enhance efficiency, we employ a reduced-space branch-and-bound approach that decouples the MDP dynamics from tree-structure constraints, enabling efficient parallel search. This significantly improves runtime and scalability compared to previous methods. Our approach ensures that each iteration yields the optimal decision tree. Experimental results on standard benchmarks demonstrate that SPOT achieves substantial speedup and scales to larger MDPs with a significantly higher number of states. The resulting decision tree policies are interpretable and compact, maintaining transparency without compromising performance. These results demonstrate that our approach simultaneously achieves interpretability and scalability, delivering high-quality policies an order of magnitude faster than existing approaches.
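The abstract does not give the MILP itself, so the following is only a minimal sketch of the underlying problem: what a decision tree policy in an MDP looks like and what it means to pick the best one. It scores every depth-1 tree on a toy five-state chain by brute-force enumeration, which is not SPOT's MILP formulation or its reduced-space branch-and-bound. The chain MDP, the reward, and all names (`step`, `tree_policy`, `evaluate`, `best_depth1_tree`) are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Toy chain MDP (an assumption, not from the paper): states 0..4,
# actions 0 = left, 1 = right, reward 1 for stepping onto the goal state.
N_STATES, ACTIONS, GAMMA = 5, (0, 1), 0.9
GOAL = 2

def step(s, a):
    """Deterministic chain dynamics: return (next_state, reward)."""
    nxt = max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def tree_policy(threshold, left_action, right_action):
    """Depth-1 tree: one test on the state index, one action per leaf."""
    return [left_action if s <= threshold else right_action
            for s in range(N_STATES)]

def evaluate(policy, sweeps=500):
    """Iterative policy evaluation; mean value under a uniform start state."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        new_v = []
        for s in range(N_STATES):
            nxt, r = step(s, policy[s])
            new_v.append(r + GAMMA * v[nxt])
        v = new_v
    return sum(v) / N_STATES

def best_depth1_tree():
    """Exhaustively score every depth-1 tree and keep the best one."""
    candidates = product(range(N_STATES - 1), ACTIONS, ACTIONS)
    return max(candidates, key=lambda c: evaluate(tree_policy(*c)))

threshold, a_left, a_right = best_depth1_tree()
print(f"if state <= {threshold}: action {a_left} else: action {a_right}")
```

Enumeration like this grows exponentially with tree depth and with the number of candidate splits, which is the scalability problem that an exact formulation with branch-and-bound, as described in the abstract, is meant to address.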

artificial intelligence, machine learning, optimization problem

Country:
- North America > United States (0.67)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.66)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Search (1.00)
    - Optimization (1.00)
  - Machine Learning
    - Decision Tree Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.84)
