The Alberta Plan for AI Research
Sutton, Richard S., Bowling, Michael, Pilarski, Patrick M.
arXiv.org Artificial Intelligence
The transition model is used to imagine possible outcomes of taking the action/option, which are then evaluated by the value functions to change the policies and the value functions themselves. This process is called planning. Planning, like everything else in the architecture, is expected to be continual and temporally uniform. On every step there will be some amount of planning, perhaps a series of small planning steps, but planning would typically not be complete in a single time step and thus would be slow compared to the speed of agent-environment interaction. Planning is an ongoing process that operates asynchronously, in the background, whenever it can be done without interfering with the first three components, all of which must operate on every time step and are said to run in the foreground.
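The interleaving described above can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a tabular, Dyna-Q-style agent, and every name in it (DynaAgent, planning_budget, run_chain, the five-state chain) is hypothetical. True background asynchrony is approximated by a fixed budget of small planning steps that run only after the foreground work of each time step is done.

```python
import random
from collections import defaultdict


class DynaAgent:
    """Hypothetical tabular agent: foreground learning plus bounded planning per step."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1,
                 planning_budget=3, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.planning_budget = planning_budget  # small planning steps per time step
        self.q = defaultdict(float)  # value function: (state, action) -> estimate
        self.model = {}              # transition model: (state, action) -> (reward, next_state)
        self.rng = random.Random(seed)

    def act(self, state):
        """Epsilon-greedy policy with random tie-breaking."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = max(self.q[(state, a)] for a in self.actions)
        return self.rng.choice(
            [a for a in self.actions if self.q[(state, a)] == best])

    def _backup(self, s, a, r, s2):
        """One-step Q-learning backup, shared by real and imagined transitions."""
        target = r + self.gamma * max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])

    def step(self, s, a, r, s2):
        # Foreground: must finish on every time step.
        self._backup(s, a, r, s2)
        self.model[(s, a)] = (r, s2)  # assumes a deterministic environment
        # Background (approximated): a few imagined transitions drawn from the
        # model. Planning is never "complete" within a single time step.
        seen = list(self.model)
        for _ in range(self.planning_budget):
            ps, pa = self.rng.choice(seen)
            pr, ps2 = self.model[(ps, pa)]
            self._backup(ps, pa, pr, ps2)


def run_chain(steps=500, planning_budget=3):
    """Toy five-state chain (assumption): reward 1 on reaching state 4, then reset to 0."""
    agent = DynaAgent(actions=(-1, 1), planning_budget=planning_budget)
    s = 0
    for _ in range(steps):
        a = agent.act(s)
        s2 = min(max(s + a, 0), 4)
        r = 1.0 if s2 == 4 else 0.0
        agent.step(s, a, r, s2)
        s = 0 if s2 == 4 else s2
    return agent
```

Calling run_chain(planning_budget=0) gives a purely model-free learner for comparison. Raising the budget spends more computation per time step on imagined experience, which is the trade-off the passage describes between planning and the speed of agent-environment interaction.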
Mar-21-2023