Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering