Multimodal Search on Iconclass using Vision-Language Pre-Trained Models