A Study on Learning Social Robot Navigation with Multimodal Perception