AGQA: A Benchmark for Compositional, Spatio-Temporal Reasoning

Jun-21-2021, 22:03:10 GMT–#artificialintelligence

Take a look at the video above and the associated question: "What did they hold before opening the closet?" After watching the video, you can easily answer that the person was holding a phone. People have a remarkable ability to comprehend visual events in new videos and to answer questions about them. In this video, the person initially holds a phone, then opens the closet and takes out a picture. To answer the question, we need to recognize the action "opening the closet" and then understand how "before" restricts our search for the answer to events that precede this action.
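The two reasoning steps described above (locate the reference action, then restrict the search to earlier events) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, the `held_before` helper, and the timestamps are hypothetical and are not taken from the AGQA benchmark or its annotation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical annotation: an action applied to an object over a time span."""
    action: str   # e.g. "holding", "opening"
    obj: str      # e.g. "phone", "closet"
    start: float  # seconds from the start of the video
    end: float


# Made-up annotations mirroring the example video described in the text.
EVENTS = [
    Event("holding", "phone", 0.0, 4.0),
    Event("opening", "closet", 4.5, 6.0),
    Event("taking", "picture", 6.5, 8.0),
    Event("holding", "picture", 8.0, 12.0),
]


def held_before(events, ref_action, ref_obj):
    """Return objects held in events that end before the reference action starts."""
    # Step 1: recognize the reference action (e.g. "opening the closet").
    ref = next(
        (e for e in events if e.action == ref_action and e.obj == ref_obj),
        None,
    )
    if ref is None:
        return []
    # Step 2: "before" restricts the search to events preceding that action.
    return [
        e.obj
        for e in events
        if e.action == "holding" and e.end <= ref.start
    ]


print(held_before(EVENTS, "opening", "closet"))  # ['phone']
```

The sketch assumes clean, non-overlapping annotations; a real system would have to infer these events from raw video, which is where the difficulty lies.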
