Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering