MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples
