Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
