On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval