A Human-AI Comparative Analysis of Prompt Sensitivity in LLM-Based Relevance Judgment
