Jointly Training Large Autoregressive Multimodal Models