Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation