ae3539867aaeec609a4260c6feb725f4-AuthorFeedback.pdf

Neural Information Processing Systems 

Thank you all for your thoughtful comments; we address your concerns below. The MDL principle formalizes Occam's razor and is a We will add the discussion of such relevant studies to section 1. TLT values for GQA are much lower than that for CLEVR (over 4 for MAC(8)). We will add these results and accompanying visualizations to appendix. We found that during evaluation, rk4 solves all the dynamics generated from CLEVR dataset. We will add these results to section 5.2.