Language Models for Semantic Extraction and Filtering in Video Action Recognition

Open in new window