eppo
The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking
Miao, Yuchun, Zhang, Sen, Ding, Liang, Zhang, Yuqi, Zhang, Lefei, Tao, Dacheng
This work identifies the Energy Loss Phenomenon in Reinforcement Learning from Human Feedback (RLHF) and its connection to reward hacking. Specifically, energy loss in the final layer of a Large Language Model (LLM) gradually increases during the RL process, with an excessive increase in energy loss characterizing reward hacking. Beyond empirical analysis, we further provide a theoretical foundation by proving that, under mild conditions, the increased energy loss reduces the upper bound of contextual relevance in LLMs, which is a critical aspect of reward hacking as the reduced contextual relevance typically indicates overfitting to reward model-favored patterns in RL. To address this issue, we propose an Energy loss-aware PPO algorithm (EPPO) which penalizes the increase in energy loss in the LLM's final layer during reward calculation to prevent excessive energy loss, thereby mitigating reward hacking. We theoretically show that EPPO can be conceptually interpreted as an entropy-regularized RL algorithm, which provides deeper insights into its effectiveness. Extensive experiments across various LLMs and tasks demonstrate the commonality of the energy loss phenomenon, as well as the effectiveness of EPPO in mitigating reward hacking and improving RLHF performance.
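As a rough illustration of the penalty described in this abstract, the Python sketch below shapes a reward by subtracting a scaled increase in final-layer energy loss. The abstract gives no formulas, so everything concrete here is an assumption: energy loss is taken to be the L1 norm of the layer's input hidden state minus the L1 norm of its output, the increase is measured against a reference model (for example the pre-RL model), and the names `energy_loss`, `penalized_reward`, and `eta` are hypothetical.

```python
from typing import Sequence


def l1_norm(vec: Sequence[float]) -> float:
    """Sum of absolute values of a hidden-state vector."""
    return sum(abs(x) for x in vec)


def energy_loss(h_in: Sequence[float], h_out: Sequence[float]) -> float:
    """Assumed definition: L1 energy entering the layer minus L1 energy leaving it."""
    return l1_norm(h_in) - l1_norm(h_out)


def penalized_reward(reward: float,
                     policy_loss: float,
                     reference_loss: float,
                     eta: float = 0.1) -> float:
    """Subtract eta times the increase in energy loss over the reference.

    Only increases are penalized here (an assumption); a decrease leaves
    the reward unchanged.
    """
    increase = max(0.0, policy_loss - reference_loss)
    return reward - eta * increase


# Toy hidden states for the final layer of a policy model.
policy_delta = energy_loss([1.0, -2.0, 3.0], [0.5, -0.5, 1.0])
shaped = penalized_reward(reward=1.0, policy_loss=policy_delta,
                          reference_loss=2.0, eta=0.5)
print(policy_delta, shaped)
```

In a real RLHF loop the shaped value would replace the raw reward model score before the PPO update; how the hidden states are pooled over tokens is left open in this sketch.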
Evolutionary Pre-Prompt Optimization for Mathematical Reasoning
Videau, Mathurin, Leite, Alessandro, Schoenauer, Marc, Teytaud, Olivier
Despite their size and complexity, large language models (LLMs) still face challenges in multi-step reasoning, particularly in tasks that require arithmetic, logic, and/or mathematical reasoning [Cobbe et al. 2021; Rae et al. 2021]. To address this limitation, recent works have focused on enhancing the reasoning abilities of LLMs. A significant advancement in this direction is the chain-of-thought (CoT) prompting method [Wei et al. 2022b]. This approach guides LLMs to articulate intermediate reasoning steps in a manner akin to human thought processes, leading to more accurate and interpretable solutions. It has shown substantial improvements on complex tasks, including mathematics and commonsense reasoning [Lu et al. 2022b; Suzgun et al. 2022; Wei et al. 2022b]. Advances in CoT prompting have in turn opened new pathways in the design of effective CoT prompts [Fu et al. 2022; Jiang et al. 2023; Kojima et al. 2022; Zhou et al. 2022].
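To make the idea of a CoT prompt concrete, here is a minimal Python sketch that assembles a few-shot prompt in which each demonstration spells out its intermediate reasoning before the answer. The `Q:`/`A:` layout, the "The answer is" phrasing, and the function name `build_cot_prompt` are illustrative assumptions, not a format taken from the cited works.

```python
from typing import Dict, List


def build_cot_prompt(demos: List[Dict[str, str]], question: str) -> str:
    """Join worked demonstrations and a new question into one prompt string.

    Each demo needs 'question', 'reasoning', and 'answer' keys. The
    reasoning text is what distinguishes a CoT demo from a plain
    question/answer pair.
    """
    parts = [
        f"Q: {d['question']}\nA: {d['reasoning']} The answer is {d['answer']}."
        for d in demos
    ]
    # The final question is left open so the model continues with its own steps.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


demos = [
    {
        "question": "Tom has 2 apples and buys 3 more. How many apples does he have?",
        "reasoning": "Tom starts with 2 apples. He buys 3 more, so 2 + 3 = 5.",
        "answer": "5",
    }
]
prompt = build_cot_prompt(demos, "A box holds 4 pens. How many pens are in 6 boxes?")
print(prompt)
```

The demonstrations placed before the final question are the part of the prompt that a prompt-design method would select or tune.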
Croatian PM sacks health minister accused of corruption
Croatia's prime minister has fired Health Minister Vili Beros following his arrest on suspicion of corruption as part of a European Union investigation. "This morning, former Minister Vili Beros and two other individuals were arrested as part of an operation conducted" by anticorruption officials, Prime Minister Andrej Plenkovic told a news conference on Friday. "As prime minister, I am personally appalled by the idea that anyone in the healthcare system would use their position either for personal enrichment or to favour someone else within the healthcare system," Plenkovic said. The European Public Prosecutor's Office (EPPO) in the capital, Zagreb, said it had launched an investigation into eight people, including Beros and the directors of two hospitals. The EU's independent public prosecution office accused the suspects, and two companies, of "accepting and giving bribes, abuse of position and authority and money laundering", it said in a statement.
Modern Experimentation Platforms
Che Sharma is the founder and CEO of Eppo, an experimentation framework that integrates with modern data platforms (cloud lakehouses and cloud data warehouses). We discuss the importance of investing in experimentation tools and the power of having a well-oiled experimentation culture within an organization. Che also explains how modern data platforms enable a variety of applications, including experimentation frameworks like Eppo. Modern data platforms have spawned a growing ecosystem of data engineering tools for areas such as data quality, data discovery and governance, data integration and much more. Meanwhile, low-code and no-code tools are making BI, analytics, and machine learning accessible to a broader range of users.