A Partially Supervised Reinforcement Learning Framework for Visual Active Search: Supplementary Material

A Policy Network Architecture and Hyperparameter Details