An introduction to Deep Q-Learning: let's play Doom


At each time step, we receive a tuple (state, action, reward, new_state). We learn from it (we feed the tuple to our neural network), and then throw away this experience. The problem is that we give our neural network sequential samples from interactions with the environment, and it tends to forget previous experiences as it overwrites them with new ones. For instance, if we are in the first level and then move to the second (which is totally different), our agent can forget how to behave in the first level.
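This is the motivation for an experience replay buffer: instead of learning from each tuple and discarding it, we store tuples and later sample random minibatches, which breaks the correlation between sequential experiences. A minimal sketch in Python (the class and parameter names here are illustrative, not from a specific library):

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (state, action, reward, new_state) tuples for later sampling."""

    def __init__(self, capacity=10000):
        # deque with maxlen drops the oldest experience once full,
        # so old levels are gradually replaced but not instantly lost.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, new_state):
        self.buffer.append((state, action, reward, new_state))

    def sample(self, batch_size):
        # Uniform random sampling decorrelates consecutive time steps,
        # so the network trains on a mix of old and recent experiences.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

At training time, the agent would call `add` after every environment step and periodically train on `sample(batch_size)` instead of the most recent transition alone.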
