Understanding and Alleviating Memory Consumption in RLHF for LLMs