PaLI-X: On Scaling up a Multilingual Vision and Language Model