Proxy-RLHF: Decoupling Generation and Alignment in Large Language Model with Proxy
