Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation
