Temporal difference learning and TD-Gammon
Complex board games are a natural testing ground for machine learning and artificial intelligence. Skill at them comes from experience, they are engaging, and they lack the safety requirements that sometimes block the use of heuristic methods. Despite recent advances, computer chess is arguably not a success of machine learning as such, because it relies on brute-force search rather than "intelligent" approaches. This paper presents an example of the opposite situation: the game-learning program TD-Gammon. TD-Gammon is a neural network that trains itself to play backgammon by playing against itself and learning from the outcome.
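To make the idea of self-play learning from the outcome concrete, the sketch below applies a temporal difference update to a much smaller game. It is an illustration under stated assumptions, not TD-Gammon itself: the game is a toy version of Nim (take 1 or 2 stones, whoever takes the last stone wins), the value function is a lookup table rather than a neural network, and the update is plain TD(0). The function name `train_nim_td` and all parameter values are hypothetical choices made for this example.

```python
import random


def train_nim_td(episodes=5000, max_pile=10, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular TD(0) self-play on a toy Nim game (assumed for illustration).

    values[n] estimates the probability that the player about to move
    wins when n stones remain. Both sides share the same table, so the
    program improves by playing against itself.
    """
    rng = random.Random(seed)
    values = [0.5] * (max_pile + 1)
    values[0] = 0.0  # no stones left: the player to move has already lost

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)  # random starting position
        while pile > 0:
            moves = [k for k in (1, 2) if k <= pile]
            if rng.random() < epsilon:
                take = rng.choice(moves)  # occasional exploratory move
            else:
                # Greedy move: leave the opponent the position it values least.
                take = min(moves, key=lambda k: values[pile - k])
            next_pile = pile - take
            # The opponent moves next, so our estimate is 1 minus theirs.
            # When next_pile == 0 this target is 1.0: the actual game outcome.
            target = 1.0 - values[next_pile]
            # TD(0) update: nudge the current estimate toward the next one.
            values[pile] += alpha * (target - values[pile])
            pile = next_pile
    return values


if __name__ == "__main__":
    learned = train_nim_td()
    for stones, value in enumerate(learned):
        print(stones, round(value, 2))
```

In this toy game, piles that are a multiple of 3 are lost for the player to move under perfect play, so the learned estimates for those piles should end up low and the others high. The only training signal is the final win or loss, which propagates backward through successive positions one update at a time.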