Intriguing Properties of Large Language and Vision Models