"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives." – from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).
In this paper we build on the Optimal Rewards Framework of Singh et al. [2010] that defines the optimal intrinsic reward function as one that when used by an RL agent achieves behavior that optimizes the
To further motivate the problem, let's consider a scenario in which an agent needs to generalize to a complex novel task by performing a composition of subtasks where the task description and
These are achieved by guiding the randomized greedy algorithm with a fast local search algorithm. Further, we develop deterministic versions of these algorithms, maintaining the same ratio and asymptotic time complexity.