A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes
