Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning