sthanikamsanthosh
Policy Gradient (REINFORCE) Using TensorFlow 2
In this article, we discuss what policy gradients are and how to implement them using TensorFlow 2. There are three main points in the policy gradient algorithm. By considering these three principles, we can implement the policy gradient using TensorFlow. We divide our source code into two parts. The policy network takes the current state as input and outputs a probability for each available action.
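To make the last point concrete, here is a minimal sketch of a policy that maps a state to action probabilities and then samples an action. It uses plain Python with a linear layer and a softmax as a stand-in for the TensorFlow network the article builds; the names `policy_probs` and `sample_action`, the weights, and the example state are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(state, weights):
    """Hypothetical linear policy: one row of weights per action."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def sample_action(probs, rng):
    """Draw an action index according to the policy's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative values only: a 4-dimensional state and 2 actions.
state = [0.1, -0.2, 0.05, 0.3]
weights = [
    [0.5, -0.1, 0.0, 0.2],
    [-0.3, 0.4, 0.1, -0.2],
]

probs = policy_probs(state, weights)
action = sample_action(probs, random.Random(0))
```

Sampling from the probabilities, rather than always taking the most likely action, is what lets a policy gradient agent explore while it learns.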
Dueling Double Deep Q Learning with TensorFlow
In this article, we go through what Dueling Double Deep Q Learning is and how to implement it in TensorFlow. Dueling Double Deep Q Learning is the combination of Dueling Deep Q Learning and Double Deep Q Learning. Let's try to understand what Dueling Deep Q Learning and Double Deep Q Learning are. One of the drawbacks of the DQN algorithm is that it overestimates the true returns: the Q-values predict a higher return than the agent will actually obtain. This overestimation comes from taking the max over the next state's Q-values in the Q-learning update equation.
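The effect of that max can be sketched with two small target functions. This is an illustration in plain Python, not the article's TensorFlow code: the function names and the Q-values are made up, and it assumes the usual Double DQN arrangement in which the online network picks the next action and the target network evaluates it.

```python
def dqn_target(reward, gamma, q_target_next, done):
    """Standard DQN target: max over the target network's next-state Q-values."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN target: the online network selects the action,
    the target network supplies that action's value."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values for three actions.
q_online_next = [1.0, 2.5, 2.0]
q_target_next = [1.2, 1.8, 3.0]

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
```

Because the Double DQN target uses one particular entry of `q_target_next` instead of its maximum, it can never exceed the standard DQN target for the same inputs, which is how it dampens the overestimation.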