Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Lu
Neural Information Processing Systems
Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains under-explored.
Nov-19-2025, 23:17:44 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe > United Kingdom
    - England
      - Cambridgeshire > Cambridge (0.04)
      - Oxfordshire > Oxford (0.04)
  - North America > United States
    - Illinois > Cook County
      - Chicago (0.04)
    - Massachusetts > Hampshire County
      - Amherst (0.04)
  - South America > Chile
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Media (0.68)
- Technology: