Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models

Open in new window