AITopics | learning reusable option

Collaborating Authors

learning reusable option

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Reusable Options for Multi-Task Reinforcement Learning

#artificialintelligenceJan-9-2020, 12:36:56 GMT

The option-critic architecture [2] is a more direct approach that learns options and a policy over options simultaneously. The option policies and their termination functions are trained using policy gradient methods, while the policy over options may be trained using any technique. One issue that often arises within this framework is that the termination functions of the learned options tend to collapse to "always terminate". In a later publication, the authors built on this work to consider the case where there is a cost associated with switching options [6]. This method resulted in the agent learning to use a single option while it was appropriate and terminate when an option switch was needed, allowing it to discover improved policies for a particular task.

learning reusable option, multi-task reinforcement learning, termination function, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

Learning Reusable Options for Multi-Task Reinforcement Learning

Garcia, Francisco M., Nota, Chris, Thomas, Philip S.

arXiv.org Artificial IntelligenceJan-6-2020

One of the main reasons why RL has worked so well in these applications is that we are able simulate millions of interactions with the environment in a relatively short period of time, allowing the agent to experience a large number of different situations in the environment and learn the consequences of its actions. In many real world applications, however, where the agent interacts with the physical world, it might not be easy to generate such a large number of interactions. The time and cost associated with training such systems could render RL an unfeasible approach for training in large scale. As a concrete example, consider training a large number of humanoid robots (agents) to move quickly, as in the Robocup competition [ Farchy et al., 2013 ] . Although the agents have similar dynamics, subtle variations mean that a single policy shared across all agents would not be an effective solution.

agent, probability, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2001.01577

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback