Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation