A   Model, training, and dataset details

All models are trained end-to-end with the Gumbel-Softmax trick.
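As a minimal sketch of end-to-end training through discrete messages, the snippet below samples one-hot tokens with PyTorch's straight-through Gumbel-Softmax; the vocabulary size and temperature here are illustrative placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Placeholder sizes: batch B = 128 (as in the paper), vocabulary 20 (assumed).
logits = torch.randn(128, 20, requires_grad=True)

# hard=True: one-hot tokens on the forward pass, soft relaxation on the
# backward pass, so gradients flow through the discrete sampling step.
tokens = F.gumbel_softmax(logits, tau=1.0, hard=True)

tokens.sum().backward()  # gradients reach the logits end-to-end
```

With `hard=False` the tokens are soft probability vectors instead; `hard=True` is the common choice when the listener must consume discrete symbols.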

Neural Information Processing Systems 

Models are trained on a single Titan Xp GPU on an internal cluster; training typically takes 6-8 hours with 4 CPUs and 32 GB of RAM. We train with batch size B = 128. As in ShapeWorld, RNN encoders and decoders are single-layer GRUs with hidden size 1024 and embedding size 500. For additional example games from both datasets, see Figure S1.
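The stated architecture sizes can be sketched as follows; the vocabulary size and message length are assumed placeholders, while the layer count, hidden size, embedding size, and batch size come from the text above.

```python
import torch
import torch.nn as nn

VOCAB = 20        # placeholder vocabulary size (not stated above)
EMBED = 500       # embedding size, from the text
HIDDEN = 1024     # GRU hidden size, from the text
B, MSG_LEN = 128, 7  # batch size from the text; message length assumed

embedding = nn.Embedding(VOCAB, EMBED)
# Single-layer GRU encoder and decoder, as described above.
encoder = nn.GRU(EMBED, HIDDEN, num_layers=1, batch_first=True)
decoder = nn.GRU(EMBED, HIDDEN, num_layers=1, batch_first=True)

msgs = torch.randint(0, VOCAB, (B, MSG_LEN))
outputs, h = encoder(embedding(msgs))  # h: (num_layers, B, HIDDEN)
```

The encoder's final hidden state `h` would typically condition the decoder or a downstream listener module.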
