An open-source training framework to advance multimodal AI