CAVE: Detecting and Explaining Commonsense Anomalies in Visual Environments