State of What Art? A Call for Multi-Prompt LLM Evaluation