Depth Matters: Multimodal RGB-D Perception for Robust Autonomous Agents