Evaluating Cost-Accuracy Trade-offs in Multimodal Search Relevance Judgements