Geometric Re-Analysis of Classical MDP Solving Algorithms
Arsenii Mustafin, Aleksei Pakharev, Alex Olshevsky, Ioannis Ch. Paschalidis
arXiv.org Artificial Intelligence
We extend a recently introduced geometric interpretation of Markov Decision Processes (MDPs) that provides a new perspective on MDP algorithms and their dynamics. Based on this view, we develop a novel analytical framework that simplifies the proofs of existing results and enables us to derive new ones. Specifically, we analyze the behavior of two classical MDP-solving algorithms: Policy Iteration (PI) and Value Iteration (VI). For each algorithm, we first describe its dynamics in geometric terms and then present an analysis along with several convergence results. First, we introduce an MDP transformation that modifies the discount factor γ and demonstrate how this transformation improves the convergence properties of both algorithms, provided that it can be applied so that the resulting system remains a regular MDP. Second, we present a new analysis of PI in the 2-state MDP case, showing that the number of iterations required for convergence is bounded by the number of state-action pairs. Finally, we reveal an additional convergence factor in the VI algorithm for cases with a connected optimal policy, which is attributed to an extra rotation component in the VI dynamics.
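For reference, the two classical algorithms the abstract analyzes can be sketched in a few lines. The toy 2-state MDP below (transition matrices, rewards, and γ = 0.9) is a hypothetical example chosen for illustration; it is not taken from the paper, and the paper's geometric framework is not reproduced here.

```python
# Textbook sketches of Value Iteration (VI) and Policy Iteration (PI).
# The MDP instance at the bottom is a hypothetical 2-state, 2-action example.
import numpy as np

def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """P[a]: |S|x|S| transition matrix of action a; R[a]: reward vector of action a."""
    n = P[0].shape[0]
    V = np.zeros(n)
    for _ in range(max_iter):
        # Bellman optimality backup: max over actions of immediate reward + discounted value
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)  # optimal values and a greedy policy

def policy_iteration(P, R, gamma):
    n = P[0].shape[0]
    policy = np.zeros(n, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly
        P_pi = np.array([P[policy[s]][s] for s in range(n)])
        R_pi = np.array([R[policy[s]][s] for s in range(n)])
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to V
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return V, policy
        policy = new_policy

# Hypothetical 2-state MDP with two actions
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.5, 0.5], [0.6, 0.4]])]   # action 1
R = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
V_vi, pi_vi = value_iteration(P, R, gamma=0.9)
V_pi, pi_pi = policy_iteration(P, R, gamma=0.9)
```

Both routines converge to the same optimal value function; the abstract's results concern how fast they do so (geometric contraction for VI, a state-action-pair bound for PI in the 2-state case).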
Mar-6-2025