Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models

Open in new window