TRINS: Towards Multimodal Language Models that Can Read

Open in new window