
Catastrophic Goodhart: regularizing RLHF with KL divergence does not mitigate heavy-tailed reward misspecification

Neural Information Processing Systems

However, if error is heavy-tailed, some policies obtain arbitrarily high reward despite achieving no more utility than the base model, a phenomenon we call catastrophic Goodhart. We adapt a discrete optimization method to measure the tails of reward models, finding that they are consistent with light-tailed error.



The world's smallest sea turtle lives in a noisy ocean

Popular Science

Noisy ships and industry are impacting critically endangered Kemp's ridley sea turtles. For the world's smallest sea turtles, life in the ocean is getting pretty noisy. These relatively little turtles (on average they still weigh 75 to 100 pounds), found mostly in the Gulf of Mexico, already face fishing gear accidents, seacraft collisions, plastic pollution, and habitat deterioration, and now excess noise may be harming the critically endangered and rare Kemp's ridley sea turtles (Lepidochelys kempii). We say "may" because even though these sea turtles share waters with extremely busy shipping lanes, scientists know very little about their underwater hearing.







Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization

Neural Information Processing Systems

Bayesian optimization (BO) conventionally relies on handcrafted acquisition functions (AFs) to sequentially determine the sample points. However, it has been widely observed in practice that the best-performing AF in terms of regret can vary significantly under different types of black-box functions. It has remained a challenge to design one AF that can attain the best performance over a wide variety of black-box functions.