Video as Conditional Graph Hierarchy for Multi-Granular Question Answering
