Learning to Play Text-based Adventure Games with Maximum Entropy Reinforcement Learning
Weichen Li, Rati Devidze, Sophie Fellenz
arXiv.org Artificial Intelligence
Text-based games are a popular testbed for language-based reinforcement learning (RL). Previous work commonly uses deep Q-learning as the learning agent. However, Q-learning algorithms are challenging to apply to complex real-world domains, for example because of their instability during training. In this paper, we therefore adapt the soft actor-critic (SAC) algorithm to the text-based environment. To deal with sparse extrinsic rewards from the environment, we combine SAC with a potential-based reward shaping technique that provides more informative (dense) reward signals to the RL agent. We apply our method to difficult text-based games. SAC achieves higher scores than the Q-learning methods on many games with only half the number of training steps, which shows that it is well suited to text-based games. Moreover, we show that reward shaping helps the agent learn the policy faster and achieve higher scores. In particular, we use a dynamically learned value function as the potential function for shaping the learner's original sparse reward signals.
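The abstract does not spell out the shaping formula. As a minimal sketch, assuming the standard potential-based form F(s, s') = γΦ(s') − Φ(s) with the critic's current value estimate standing in for the potential Φ, the shaped reward could be computed as follows. The function name, its parameters, and the zero-potential treatment of terminal states are illustrative assumptions, not details taken from the paper.

```python
def shaped_reward(reward, value_s, value_next, gamma=0.99, done=False):
    """Return the reward plus a potential-based shaping term.

    value_s and value_next are the potentials Phi(s) and Phi(s'),
    assumed here to come from a learned value function (hypothetical setup).
    """
    # Assumption of this sketch: terminal states have zero potential.
    next_potential = 0.0 if done else value_next
    # Shaping term: gamma * Phi(s') - Phi(s), added to the sparse reward.
    return reward + gamma * next_potential - value_s


if __name__ == "__main__":
    # A step with no extrinsic reward still yields a dense signal
    # when the estimated value increases from s to s'.
    print(shaped_reward(0.0, value_s=0.5, value_next=1.0, gamma=0.9))
```

Because the shaping term is a difference of potentials, it telescopes along a trajectory, which is why this form of shaping is known to leave the optimal policy unchanged.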
Jun-27-2023
- Country:
  - Asia > Middle East
    - Israel > Haifa District
      - Haifa (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
  - Europe
    - Germany
      - Berlin (0.04)
      - Rhineland-Palatinate > Kaiserslautern (0.04)
      - Saarland > Saarbrücken (0.04)
    - Portugal > Lisbon
      - Lisbon (0.04)
    - Slovenia > Upper Carniola
      - Municipality of Bled > Bled (0.04)
    - Sweden > Stockholm
      - Stockholm (0.04)
  - North America > United States
    - Texas > Travis County > Austin (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Leisure & Entertainment > Games > Computer Games (0.86)