Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
