


A Architecture Details

Neural Information Processing Systems

We provide architectural details beyond those given in the paper. In all models, the output layer computes logits, followed by a softmax cross-entropy categorical loss term. Figure 6 provides the grammar.

Figure 6: Grammar describing the generated programs comprising the dataset in this paper.

Figure 8: The same programs as in Figure 7, with a single statement masked in each.
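The output layer described above (logits followed by a softmax cross-entropy categorical loss) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `softmax_cross_entropy`, the argument names, and the example logits are assumptions chosen for illustration, and the sketch handles a single example with an integer class label rather than a batch.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy between softmax(logits) and a one-hot target.

    Hypothetical helper for illustration; names are not from the paper.
    """
    # Subtract the largest logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits)
    )
    # -log softmax(logits)[target] = logsumexp(logits) - logits[target]
    return log_sum_exp - logits[target_index]


# Example: three candidate classes, with class 0 as the true label (made-up values).
loss = softmax_cross_entropy([2.0, 0.5, -1.0], target_index=0)
```

Computing the loss as a log-sum-exp minus the target logit avoids forming the softmax probabilities explicitly, which keeps the result finite even when the logits are large.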