6 Supplementary Material

Neural Information Processing Systems 

The original CLUTRR data generation framework ensured that no test proof appears in the training set, in order to test whether a model can generalize to unseen proofs. Initial experiments on the original CLUTRR test sets showed strong model performance (99%) on levels seen during training (2, 4, 6) but no generalization at all (0%) to other levels. The models are given "[story] [query]" as input and asked to generate the proof and answer. Models are trained on levels 2, 4, and 6 only. In our case, the entity names are important to evaluate systematic generalization.
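The setup described above can be illustrated with a minimal sketch. It is not the CLUTRR framework's actual code: the `Example` record, the function names, and the single-space separator between story and query are all assumptions made for illustration. It shows how an input string might be assembled, how examples could be split by level, and how candidate test examples whose proof already occurs in training could be dropped.

```python
from dataclasses import dataclass

# Levels seen during training, as stated in the text.
TRAIN_LEVELS = {2, 4, 6}


@dataclass(frozen=True)
class Example:
    """Hypothetical record for one example; field names are assumptions."""
    story: str
    query: str
    proof: str
    answer: str
    level: int


def format_input(ex):
    # Assumed layout "[story] [query]"; the exact separator is a guess.
    return f"{ex.story} {ex.query}"


def split_by_level(examples):
    # Training levels go to the first list, all other levels to the second.
    seen = [ex for ex in examples if ex.level in TRAIN_LEVELS]
    unseen = [ex for ex in examples if ex.level not in TRAIN_LEVELS]
    return seen, unseen


def drop_seen_proofs(train, candidates):
    # Keep only candidate test examples whose proof never occurs in training.
    train_proofs = {ex.proof for ex in train}
    return [ex for ex in candidates if ex.proof not in train_proofs]
```

Comparing proofs as exact strings is the simplest possible choice here; a real pipeline might compare them in some normalized form instead.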
