6 Supplementary Material

Feb-19-2026, 09:45:27 GMT – Neural Information Processing Systems

The original CLUTRR data generation framework ensured that no test proof appears in the training set, in order to test whether a model can generalize to unseen proofs. Initial experiments on the original CLUTRR test sets showed strong model performance (99%) on the levels seen during training (2, 4, 6) but no generalization at all (0%) to other levels. The models are given "[story] [query]" as input and asked to generate the proof and the answer. Models are trained on levels 2, 4, and 6 only. In our case, entity names matter for evaluating systematic generalization.
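
The setup described above can be illustrated with a minimal sketch. It is not the authors' code: the field names (`level`, `story`, `query`, `proof`) and the helper functions are hypothetical, chosen only to show the "[story] [query]" input format, training restricted to levels 2, 4, and 6, and the removal of test examples whose proof already occurs in the training set.

```python
from typing import Dict, List, Tuple

# Levels seen during training, as stated in the text.
TRAIN_LEVELS = {2, 4, 6}

# Hypothetical example record: {"level": int, "story": str, "query": str, "proof": str}
Example = Dict[str, object]


def format_input(story: str, query: str) -> str:
    """Build the model input as "[story] [query]"."""
    return f"{story.strip()} {query.strip()}"


def split_by_level(examples: List[Example]) -> Tuple[List[Example], List[Example]]:
    """Separate examples at training levels from examples at all other levels."""
    seen = [ex for ex in examples if ex["level"] in TRAIN_LEVELS]
    unseen = [ex for ex in examples if ex["level"] not in TRAIN_LEVELS]
    return seen, unseen


def drop_seen_proofs(train: List[Example], test: List[Example]) -> List[Example]:
    """Keep only test examples whose proof does not occur in the training set."""
    train_proofs = {ex["proof"] for ex in train}
    return [ex for ex in test if ex["proof"] not in train_proofs]
```

With such a split, accuracy can be reported separately for the seen levels (2, 4, 6) and for every other level, which is the comparison the paragraph above draws.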


