Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint