Better Exploration with Parameter Noise

#artificialintelligence 

Parameter noise helps algorithms more efficiently explore the range of actions available to solve an environment. After 216 episodes of training DDPG without parameter noise will frequently develop inefficient running behaviors, whereas policies trained with parameter noise often develop a high-scoring gallop. Parameter noise lets us teach agents tasks much more rapidly than with other approaches. After learning for 20 episodes on the HalfCheetah Gym environment (shown above), the policy achieves a score of around 3,000, whereas a policy trained with traditional action noise only achieves around 1,500. Parameter noise adds adaptive noise to the parameters of the neural network policy, rather than to its action space. Traditional RL uses action space noise to change the likelihoods associated with each action the agent might take from one moment to the next.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found