Cross-modal Causal Relation Alignment for Video Question Grounding
