Behind DeepMind's Framework That Discovers New RL Algorithms
DeepMind recently introduced a new meta-learning approach that generates a reinforcement learning algorithm known as Learned Policy Gradient (LPG). According to the researchers, automating the discovery of update rules from data could lead to more efficient algorithms that could also be better adapted to specific environments. That one technique of machine learning which can be compared with the psychological behaviour of animals is reinforcement learning. The objective of reinforcement learning is to maximise the expected cumulative rewards or average rewards. This algorithm has gained much traction by researchers and developers over the past few years.
Jul-22-2020, 15:07:08 GMT
- Technology: