linear function approximation
A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes
The proximal policy optimization (PPO) algorithm stands as one of the most prosperous methods in the field of reinforcement learning (RL). Despite its success, the theoretical understanding of PPO remains deficient. Specifically, it is unclear whether PPO or its optimistic variants can effectively solve linear Markov decision processes (MDPs), which are arguably the simplest models in RL with function approximation.
Country:
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
Country:
- Asia > Middle East > Jordan (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
Technology:
Country:
- Oceania > Australia > New South Wales > Sydney (0.14)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (22 more...)
Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Country:
- Oceania > Australia > New South Wales > Sydney (0.14)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (22 more...)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.83)
Country:
- North America > Canada > Alberta (0.14)
- Asia > Middle East > Jordan (0.04)
Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Technology:
Country:
- Oceania > New Zealand (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
Industry:
Technology:
Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
- Asia > Middle East > Jordan (0.04)
Technology:
Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
- (2 more...)
Technology:
- Information Technology > Communications (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)