Supplement to Drawing out of Distribution with Symbolic Generative Models A Details

Neural Information Processing Systems 

All dataset images are scaled to 50x50 in grayscale, with dataset-specific configuration list below. For one-shot classification ( 3.2), we use the original task-split, as found on It has 20 episodes, each a 20-way, 1-shot, within-alphabet classification task. Bézier curves are parametric curves commonly used in computer graphics to define smooth, continuous curves. As an effect of this rasterizing procedure, the pixel intensity can be arbitrarily large. B.3 Neural Network Configurations DooD and AIR in our experiments share the overall neural components.