An experimental design perspective on model-based reinforcement learning

May-19-2022, 14:05:12 GMT–AIHub

We evaluate BARL in the TQRL setting on five environments that span a variety of reward function types, dimensionalities, and amounts of required data. In this evaluation, we estimate the minimum amount of data each algorithm needs to learn a controller. The environments are the standard underactuated pendulum swing-up task, a cartpole swing-up task, the standard 2-DOF reacher task, a navigation problem in which the agent must find a path across pools of lava, and a simulated nuclear fusion control problem in which the agent modulates the power injected into the plasma to achieve a target pressure. To assess how quickly BARL solves MDPs, we assembled a group of reinforcement learning algorithms that represent the state of the art in solving continuous MDPs. We compare against the model-based algorithms PILCO [7], PETS [2], model-predictive control with a GP (MPC), and uncertainty sampling with a GP, as well as the model-free algorithms SAC [3], TD3 [8], and PPO [9].
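One way to read "the minimum amount of data an algorithm needs to learn a controller" is as the smallest data budget after which a learning curve stays at or above some target return. The sketch below illustrates that reading only. The function name `samples_to_threshold`, the "stays above the threshold" criterion, and the numbers in the usage example are assumptions made for illustration, not the evaluation protocol or results from the article.

```python
from typing import Optional, Sequence


def samples_to_threshold(
    num_samples: Sequence[int],
    returns: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Smallest data budget from which the return stays >= threshold.

    Hypothetical helper. `num_samples` must be sorted in increasing order
    and aligned with `returns` (one evaluation per data budget).
    Returns None if the final evaluation is still below the threshold.
    """
    if len(num_samples) != len(returns):
        raise ValueError("num_samples and returns must have the same length")

    answer = None
    # Walk backwards from the largest budget so that a single lucky early
    # evaluation followed by a drop does not count as having learned the task.
    for n, r in zip(reversed(num_samples), reversed(returns)):
        if r < threshold:
            break
        answer = n
    return answer


# Made-up learning curve: the return first touches -1.0 at 20 samples,
# drops again at 30, and stays above -1.0 from 40 samples onward.
budget = samples_to_threshold(
    num_samples=[10, 20, 30, 40, 50],
    returns=[-5.0, -1.0, -3.0, -0.8, -0.5],
    threshold=-1.0,
)
print(budget)
```

Scanning from the end makes the estimate conservative: a method gets credit only for the budget after which it never falls back below the target in the recorded evaluations.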



