Belief-Dependent Macro-Action Discovery in POMDPs using the Value of Information

Genevieve Flaspohler 1,2, Nicholas Roy 1, and John W. Fisher III 1

1 Massachusetts Institute of Technology

Neural Information Processing Systems 

This supplement includes additional details, figures, and analysis omitted from the main text due to space limitations. Section A presents algorithm descriptions for macro-action generation, including modified value iteration and macro-action chaining. Section B gives detailed derivations of Lemmas 5.3, 5.4, and 5.5 from the main text. Section C details macro-action generation for discrete POMDPs with α-vector value function representations and discusses its algorithmic complexity in the discrete case. Finally, Section D provides additional visualizations and discussion of the experimental results.