Retrieving Semantically Similar Decisions under Noisy Institutional Labels: Robust Comparison of Embedding Methods