Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow

Chen-Hao Chao, Wei-Fang Sun

Mar-21-2025, 18:30:38 GMT–Neural Information Processing Systems

Existing Maximum-Entropy (MaxEnt) Reinforcement Learning (RL) methods for continuous action spaces are typically formulated based on actor-critic frameworks and optimized through alternating steps of policy evaluation and policy improvement. In the policy evaluation steps, the critic is updated to capture the soft Q-function. In the policy improvement steps, the actor is adjusted in accordance with the updated soft Q-function. In this paper, we introduce a new MaxEnt RL framework modeled using Energy-Based Normalizing Flows (EBFlow).
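The alternation between policy evaluation and policy improvement described above can be illustrated on a small tabular problem. The sketch below is not the EBFlow method and does not handle continuous actions; it assumes a finite MDP with known transitions `P` and rewards `R`, and a temperature `alpha`, all of which are hypothetical names introduced here for illustration. With a single evaluation backup per round, as written, the loop reduces to soft value iteration.

```python
import numpy as np

def soft_value(q, alpha=1.0):
    """Soft value V(s) = alpha * log sum_a exp(Q(s, a) / alpha), computed stably."""
    z = q / alpha
    m = z.max(axis=-1, keepdims=True)
    return alpha * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

def soft_policy_iteration(P, R, gamma=0.9, alpha=1.0, iters=500):
    """Alternate soft policy evaluation and improvement on a tabular MDP.

    P: (S, A, S) transition probabilities, R: (S, A) rewards (assumed known).
    """
    n_states, n_actions = R.shape
    q = np.zeros((n_states, n_actions))
    log_pi = np.full((n_states, n_actions), -np.log(n_actions))  # uniform start
    for _ in range(iters):
        # Policy evaluation: one soft Bellman backup under the current policy.
        pi = np.exp(log_pi)
        v = (pi * (q - alpha * log_pi)).sum(axis=1)
        q = R + gamma * (P @ v)
        # Policy improvement: pi(a|s) proportional to exp(Q(s, a) / alpha).
        log_pi = (q - soft_value(q, alpha)[:, None]) / alpha
    return q, np.exp(log_pi)
```

In the continuous-action setting the abstract refers to, the log-sum-exp in `soft_value` becomes an integral over actions, which is generally intractable and is one reason actor-critic methods keep a separate actor.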

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:
- North America > United States (0.28)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks (1.00)
    - Reinforcement Learning (1.00)
    - Statistical Learning > Maximum Entropy (0.61)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.45)
