oMeBench: Towards Robust Benchmarking of LLMs in Organic Mechanism Elucidation and Reasoning