Vision-Language Models for Autonomous Driving: CLIP-Based Dynamic Scene Understanding

Open in new window