Distributional Preference Alignment of LLMs via Optimal Transport
Igor Melnyk
Neural Information Processing Systems
Ouyang et al., 2022, Bai et al., 2022], achieves this by learning a reward model on human preference
Oct-10-2025, 15:09:30 GMT
- Country:
- Asia > Middle East > Jordan (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Education (0.45)
- Information Technology (0.46)
- Technology: