A Puzzle-Based Dataset for Natural Language Inference