Using Cases as Heuristics in Reinforcement Learning: A Transfer Learning Application
Jr., Luiz A. Celiberto (Technological Institute of Aeronautics) | Matsuura, Jackson P. (Technological Institute of Aeronautics) | Mantaras, Ramon Lopez de (Artificial Intelligence Research Institute (IIIA-CSIC)) | Bianchi, Reinaldo A. C. (Centro Universitario da FEI)
Another way to speed up a RL algorithm is by using Transfer Learning, a paradigm of machine learning that In this paper we propose to combine three AI techniques reuses knowledge accumulated in a previous task to speed up to speed up a Reinforcement Learning algorithm the learning of a novel, but related, target task [Taylor and in a Transfer Learning problem: Casebased Stone, 2009]. Reasoning, Heuristically Accelerated Reinforcement This paper investigates the use of the Case-Based Heuristically Learning and Neural Networks. To do Accelerated Reinforcement Learning (CB-HARL) algorithm so, we propose a new algorithm, called L3, which [Bianchi et al., 2009] as a means to transfer learning works in 3 stages: in the first stage, it uses Reinforcement acquired by one agent during its training in one problem to Learning to learn how to perform one another agent that has to learn how to solve a similar, but task, and stores the optimal policy for this problem more complex, problem. To do so, we propose a new algorithm, as a case-base; in the second stage, it uses a Neural called L3, which works in 3 stages: in the first stage, Network to map actions from one domain to actions it uses the Q-learning algorithm [Watkins, 1989] to learn how in the other domain and; in the third stage, it uses to perform one task, and stores the optimal policy for this the case-base learned in the first stage as heuristics problem as a case-base; in the second stage, it uses a Neural to speed up the learning performance in a related, Network to map actions from one domain to actions in but different, task. The RL algorithm used the other domain and; in the third stage, it uses the case-base in the first phase is the Q-learning and in the third learned in the first stage as heuristics in the CB-HARL algorithm, phase is the recently proposed Case-based Heuristically speeding up the learning process.
- Country:
- South America > Brazil (0.04)
- Europe
- Spain (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report (0.86)
- Instructional Material > Course Syllabus & Notes (0.56)
- Overview (0.55)
- Industry:
- Leisure & Entertainment > Sports (0.48)
- Technology: