A3C -- What It Is & What I Built
The basic actor-critic model stems from Deep Convolutional Q-Learning, in which the agent implements Q-learning but, instead of taking in a matrix of states as input, takes in images and feeds them into a deep convolutional neural network. Don't worry about the rectangles on the right side; they represent a deep neural network with all its nodes and connections. It's just easier to explain and understand A3C this way. In a regular Deep Convolutional Q-Learning network there is only one output: the Q-values of the different actions. In A3C, however, there are two outputs. One scores the different actions (the actor; in A3C these scores are normally read as a policy over actions rather than as Q-values), and the other estimates the value of the state the agent is currently in (the critic).
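The two-output idea described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the network from the post: the class name `TwoHeadedNet`, the layer sizes, the random initialisation, and the softmax over the action scores are all assumptions made for the example, and a plain feature vector stands in for the output of the convolutional layers.

```python
import math
import random


def linear(inputs, weights, biases):
    """Dense layer: one output per row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw action scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoHeadedNet:
    """Shared trunk feeding an actor head and a critic head."""

    def __init__(self, n_features, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, n_hidden, n_features)
        self.actor = init_layer(rng, n_actions, n_hidden)  # one score per action
        self.critic = init_layer(rng, 1, n_hidden)         # single state value

    def forward(self, features):
        # Shared hidden layer with a ReLU non-linearity.
        hidden = [max(0.0, h) for h in linear(features, *self.trunk)]
        action_scores = linear(hidden, *self.actor)
        state_value = linear(hidden, *self.critic)[0]
        return action_scores, state_value


net = TwoHeadedNet(n_features=8, n_hidden=16, n_actions=4)
features = [0.5] * 8  # stand-in for the flattened conv features of one image
action_scores, state_value = net.forward(features)
action_probs = softmax(action_scores)  # assumed: actor scores read as a policy
```

Both heads read the same hidden layer, so the image is processed once per step; only the last layer differs between the actor and the critic.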
Nov-10-2019, 16:10:31 GMT