Reviews: Connective Cognition Network for Directional Visual Commonsense Reasoning

Neural Information Processing Systems 

Originality: The paper proposes a novel model for the recently introduced VCR task. The main novelty of the proposed model lies in the component GraphVLAD and directional GCN modules. The paper describes that one of the closest works to this work is that of Narsimhan et al., NeurIPS 2018 that used GCN to infer answers in VQA, however that work constructs an undirected graph, ignoring the directional information between the graph nodes. This paper uses directed graph instead and shows the usefulness of incorporating directional information. It would be good for this paper to include more related work on GraphVLAD front. Quality: The paper evaluates the proposed approach on the VCR dataset and compares with the baselines and previous state-of-the-art, demonstrating how the proposed work improves the previous best performance significantly.