Using Options to Accelerate Learning of New Tasks According to Human Preferences
Bonini, Rodrigo Cesar (Universidade de São Paulo) | Silva, Felipe Leno da (Universidade de São Paulo) | Spina, Edison (Universidade de São Paulo) | Costa, Anna Helena Reali (Universidade de São Paulo)
Over the years, people have needed to incorporate an ever wider range of information and multiple objectives into their decision making. Nowadays, humans depend on computer systems to interpret and profit from the huge amount of data available on the Internet. Hence, varied services, such as location-based systems, must combine large quantities of raw data to give the desired response to the user. However, as humans have different preferences, the optimal answer differs for each user profile, and few systems solve tasks in a manner customized to each user. Reinforcement Learning (RL) has been used to autonomously train systems to solve (or assist in) decision-making tasks according to user preferences. However, the learning process is very slow and requires many interactions with the environment. Therefore, we here propose to reuse knowledge from previous tasks to accelerate the learning process in a new task. Our proposal, called Multiobjective Options, accelerates learning while providing a customized solution according to the current user preferences. Our experiments in the Tourist World Domain show that our proposal learns faster and better than regular learning, and that the achieved solutions follow user preferences.
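The abstract's idea of solving a task "according to user preferences" is commonly realized in multiobjective RL by scalarizing a vector-valued reward with the user's preference weights and learning on the scalarized signal. A minimal sketch, assuming tabular Q-learning and linear scalarization (the states, actions, rewards, and weights below are illustrative assumptions, not the paper's code):

```python
import numpy as np

def scalarize(reward_vec, weights):
    """Combine a multiobjective reward vector into one scalar
    via the user's preference weights (linear scalarization)."""
    return float(np.dot(reward_vec, weights))

def q_update(Q, s, a, r_vec, s_next, weights, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the preference-scalarized reward."""
    r = scalarize(r_vec, weights)
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    return Q

# Toy illustration: two states, two actions, two objectives.
Q = np.zeros((2, 2))
weights = np.array([0.7, 0.3])  # hypothetical user preference profile
Q = q_update(Q, s=0, a=1, r_vec=np.array([1.0, 0.0]),
             s_next=1, weights=weights)
print(Q[0, 1])  # 0.1 * (0.7 + 0.9 * 0 - 0) = 0.07
```

Different preference weights steer the same learner toward different optimal policies, which is the customization the abstract describes; the paper's contribution is reusing options across such tasks rather than the scalarization itself.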
Feb-4-2017