Hierarchical Visual Feature Aggregation for OCR-Free Document Understanding

Neural Information Processing Systems 

We present a novel OCR-free document understanding framework based on pre-trained Multimodal Large Language Models (MLLMs).
