Multi-Reference Preference Optimization for Large Language Models