ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation
