Supplementary Material: A Benchmark for Compositional Visual Reasoning