The in-context inductive biases of vision-language models differ across modalities