TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
