Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks
Zhang, Ruohan, Torabi, Faraz, Warnell, Garrett, Stone, Peter
arXiv.org Artificial Intelligence
With respect to artificial learning agents in particular, humans must provide some specification of what the agent should learn to perform. One method by which humans typically provide this specification is by designing a stationary reward function. This function provides a reward to the agent when it correctly performs the desired task and, perhaps, punishment when the agent does not. Artificial learning agents may then approach the task-learning process using reinforcement learning (RL) techniques (Sutton and Barto, 2018) that seek to find a policy (i.e., an explicit function that the agent uses to make decisions) that allows the agent to gather as much reward as possible. Another popular way in which humans specify tasks for artificial agents to learn is by demonstrating the task themselves. Typically, this is accomplished by having the human perform the task while the learning agent observes the actions that the human takes (e.g., the human physically moving a robot arm). In these cases, artificial agents may use approaches from imitation learning (IL) (Schaal, 1999; Argall et al., 2009; Osa et al., 2018) in order to find policies that allow them to perform the demonstrated task.
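The two task-specification paradigms described above can be illustrated with a minimal, self-contained sketch (not from the paper): a hypothetical five-state chain environment where a stationary reward function pays 1 for reaching the goal. An RL agent (tabular Q-learning) learns a policy from that reward, while an IL agent (behavioral cloning by majority vote) learns a policy directly from state-action demonstrations. All names and parameters here are illustrative assumptions.

```python
import random
from collections import Counter

# Hypothetical 5-state chain: start at state 0, goal at state 4.
# Actions: 0 = move left, 1 = move right.
N_STATES, GOAL, ACTIONS = 5, 4, (0, 1)

def step(state, action):
    """Environment dynamics with a stationary reward function:
    reward 1.0 only when the goal state is reached, else 0.0."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """RL: learn a policy that gathers as much reward as possible
    via tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:
                a = rng.choice(ACTIONS)            # explore
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])  # exploit
            s2, r, done = step(s, a)
            # Standard Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    # Extract the greedy policy: one action per state.
    return [max(ACTIONS, key=lambda x: q[s][x]) for s in range(N_STATES)]

def behavioral_cloning(demos):
    """IL: imitate the demonstrator by copying the most frequent
    demonstrated action in each state (majority-vote cloning)."""
    counts = [Counter() for _ in range(N_STATES)]
    for s, a in demos:
        counts[s][a] += 1
    # Default to action 1 in states never demonstrated.
    return [counts[s].most_common(1)[0][0] if counts[s] else 1
            for s in range(N_STATES)]

rl_policy = q_learning()
# "Human" demonstrations: always move right toward the goal.
demos = [(s, 1) for s in range(N_STATES - 1)]
il_policy = behavioral_cloning(demos)
```

Both routes yield a policy mapping states to actions; the difference is the human-provided signal they consume — a reward function versus demonstrations.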
Jul-12-2021