LIRE: listwise reward enhancement for preference alignment