Self-Supervised Representation Learning on Document Images
Cosma, Adrian, Ghidoveanu, Mihai, Panaitescu-Liess, Michael, Popescu, Marius
While previous approaches explore the effect of self-supervision on natural images, we show that patch-based pre-training performs poorly on document images because of their different structural properties and poor intra-sample semantic information. We propose two context-aware alternatives to improve performance on the Tobacco-3482 image classification task. We also propose a novel method for self-supervision, which makes use of the inherent multi-modality of documents (image and text), which performs better than other popular self-supervised methods, including supervised ImageNet pre-training, on document image classification scenarios with a limited amount of data.
May-27-2020
- Country:
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Genre:
- Research Report > Promising Solution (0.34)
- Technology: