Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering

Medhini Narasimhan, Svetlana Lazebnik, Alexander Schwing

Neural Information Processing Systems 

Accurately answering a question about a given image requires combining observations with general knowledge. While this is effortless for humans, reasoning with general knowledge remains an algorithmic challenge. To advance research in this direction, a novel 'fact-based' visual question answering (FVQA) task has been introduced recently along with a large set of curated facts which link two entities, i.e., two possible answers, via a relation.