Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions
