Adjust Planning Strategies to Accommodate Reinforcement Learning Agents
arXiv.org Artificial Intelligence
The solution of many continuous decision problems can be described as the following process: an agent sets out from an initial state, passes through a series of intermediate states, and finally reaches the goal state. Imagine an agent in a maze that must find certain key positions and pass through them one by one to get out. The agent has two types of behavior. One is the micro-level action taken at every state, analogous to muscle activity, which we call reaction; the other is the change in the trend of reactions taken over a period of time, analogous to human thought, which we call planning [15]. For the agent in the maze, a reaction can be each small moving step, and planning can be each decision about which position it should reach next. In a complicated scene with a high-dimensional data stream, a long-horizon decision process, and sparse supervision signals, an agent trained only to react [9, 10] can hardly perform well (see Appendix A for a demonstration). However, combining reaction and planning [3, 4, 14] can significantly improve its capability. The essence of this improvement is that the agent has limited reaction capability, and the introduction of planning frees the agent from having to react across the whole task.
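The maze example above can be sketched in code. The following is a minimal illustration (not the paper's method) of the two behavior levels: a planning function that chooses the next key position to head for, and a reaction function that takes one small step toward it. All names and the greedy step rule are illustrative assumptions.

```python
def plan(position, remaining_targets):
    """Planning level: decide which position to reach next.
    Here, simply the next unvisited target (illustrative)."""
    return remaining_targets[0]

def react(position, subgoal):
    """Reaction level: one small step that moves toward the
    current subgoal (greedy move along one axis)."""
    x, y = position
    gx, gy = subgoal
    if x != gx:
        return (x + (1 if gx > x else -1), y)
    if y != gy:
        return (x, y + (1 if gy > y else -1))
    return position

def run_episode(start, keys, goal, max_steps=100):
    """Alternate planning and reaction until all key positions
    and finally the goal are reached."""
    position = start
    targets = list(keys) + [goal]
    steps = 0
    while targets and steps < max_steps:
        subgoal = plan(position, targets)    # planning: pick next key position
        position = react(position, subgoal)  # reaction: one small moving step
        if position == subgoal:
            targets.pop(0)                   # subgoal reached; plan the next one
        steps += 1
    return position, steps

# Agent starts at (0, 0), must pass keys (2, 1) and (4, 3), then exit at (5, 5).
final, steps = run_episode((0, 0), [(2, 1), (4, 3)], (5, 5))
```

The point of the sketch is the division of labor: the reaction policy only ever solves the short, local problem of reaching the current subgoal, while planning handles the long-horizon structure of the task.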
Mar-18-2020