Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models