Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity

Neural Information Processing Systems 

Reinforcement Learning (RL) encompasses diverse paradigms, including model-based RL, policy-based RL, and value-based RL, each tailored to approximate the model, the optimal policy, and the optimal value function, respectively. This work investigates the potential hierarchy of representation complexity among these RL paradigms. By utilizing computational complexity measures, including time complexity and circuit complexity, we theoretically unveil a potential representation complexity hierarchy within RL. We find that representing the model emerges as the easiest task, followed by the optimal policy, while representing the optimal value function presents the most intricate challenge. Additionally, we reaffirm this hierarchy from the perspective of the expressiveness of Multi-Layer Perceptrons (MLPs), which aligns more closely with practical deep RL and contributes a completely new perspective to the theoretical study of representation complexity in RL. Finally, we conduct deep RL experiments to validate our theoretical findings.