A Model, training, and dataset details

All models are trained end-to-end with the Gumbel-Softmax [
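As an illustrative sketch only (not the authors' code), end-to-end training through discrete message tokens is typically done with a straight-through Gumbel-Softmax sample: Gumbel noise is added to the logits, a temperature-scaled softmax gives a differentiable relaxation, and (optionally) the forward pass is replaced with the one-hot argmax. The temperature `tau` below is a hypothetical value, not one reported in the paper.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, hard=True, rng=None):
    """Sample from a Gumbel-Softmax relaxation of a categorical distribution.

    logits: array with categories along the last axis.
    tau:    temperature; lower values make samples closer to one-hot.
    hard:   if True, return the one-hot argmax of the relaxed sample
            (in a differentiable framework this would be combined with
            the soft sample via the straight-through trick).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse CDF: -log(-log(U)), U ~ Uniform(0, 1)
    u = rng.uniform(1e-12, 1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))
    # Temperature-scaled softmax of the perturbed logits
    z = (np.asarray(logits) + g) / tau
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    soft = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    if not hard:
        return soft
    # Discretize: one-hot of the argmax of the relaxed sample
    idx = np.expand_dims(soft.argmax(axis=-1), axis=-1)
    one_hot = np.zeros_like(soft)
    np.put_along_axis(one_hot, idx, 1.0, axis=-1)
    return one_hot

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))          # batch of 4, 10 vocabulary symbols
soft_sample = gumbel_softmax(logits, tau=0.5, hard=False, rng=rng)
hard_sample = gumbel_softmax(logits, tau=0.5, hard=True, rng=rng)
```

As `tau` decreases, the soft samples concentrate on a single symbol; the `hard=True` branch is what lets the speaker emit discrete tokens at evaluation time while gradients flow through the relaxation during training.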

Neural Information Processing Systems 

Models are trained on a single Titan Xp GPU on an internal cluster. Training time is typically 6-8 hours with 4 CPUs and 32 GB of RAM. We train with batch size B = 128. As with ShapeWorld, RNN encoders and decoders are single-layer GRUs with hidden size 1024 and embedding size 500. For additional example games from both datasets, see Figure S1.
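To make the stated sizes concrete, the following sketch wires up a single-layer GRU message encoder with the appendix's hyperparameters (embedding size 500, hidden size 1024). It is a minimal numpy illustration, not the authors' implementation; the vocabulary size, initialization, and message below are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal single-layer GRU cell (gates stacked as [reset; update; new])."""

    def __init__(self, input_size, hidden_size, rng):
        s = 1.0 / np.sqrt(hidden_size)
        self.W = rng.uniform(-s, s, (3 * hidden_size, input_size))   # input weights
        self.U = rng.uniform(-s, s, (3 * hidden_size, hidden_size))  # recurrent weights
        self.b = np.zeros(3 * hidden_size)
        self.hidden_size = hidden_size

    def step(self, x, h):
        H = self.hidden_size
        gi = self.W @ x + self.b  # input contributions to all three gates
        gh = self.U @ h           # recurrent contributions
        r = sigmoid(gi[:H] + gh[:H])            # reset gate
        z = sigmoid(gi[H:2 * H] + gh[H:2 * H])  # update gate
        n = np.tanh(gi[2 * H:] + r * gh[2 * H:])  # candidate state
        return (1 - z) * n + z * h

# Hyperparameters from the appendix: embedding size 500, hidden size 1024.
# Vocabulary size and the example message are illustrative assumptions.
rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_dim = 20, 500, 1024
embedding = rng.normal(0.0, 0.1, (vocab_size, embed_dim))
cell = GRUCell(embed_dim, hidden_dim, rng)

# Encode a message of token ids into a final hidden state
message = [3, 7, 1]
h = np.zeros(hidden_dim)
for token in message:
    h = cell.step(embedding[token], h)
```

A decoder would use the same cell sizes in the other direction, projecting each hidden state back to vocabulary logits at every step.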