
4b3cc0d1c897ebcf71aca92a4a26ac83-Paper-Conference.pdf

Neural Information Processing Systems

More specifically, for the output features of the penultimate layer, for each class the within-class features converge to their means, and the means of different classes exhibit a certain tight frame structure, which is also aligned with the last layer's classifier.
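The two properties described in this excerpt can be checked numerically. The sketch below is a minimal illustration, not the paper's method: it assumes balanced classes and takes the simplex equiangular tight frame (pairwise cosine of -1/(C-1) between centered class means) as one common instance of the "tight frame structure", which the excerpt does not specify. All function names are hypothetical, and alignment with the classifier is not covered.

```python
import math

def class_means(features, labels):
    """Return {label: mean feature vector} for penultimate-layer features."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def within_class_variance(features, labels, means):
    """Average squared distance of each feature to its own class mean."""
    total = 0.0
    for x, y in zip(features, labels):
        total += sum((a - b) ** 2 for a, b in zip(x, means[y]))
    return total / len(features)

def pairwise_mean_cosines(means):
    """Cosines between class means after subtracting the mean of the class means."""
    classes = sorted(means)
    dim = len(means[classes[0]])
    # Assumption: balanced classes, so the global mean is the mean of class means.
    global_mean = [sum(means[c][i] for c in classes) / len(classes) for i in range(dim)]
    centered = {c: [m - g for m, g in zip(means[c], global_mean)] for c in classes}
    cosines = []
    for i, a in enumerate(classes):
        for b in classes[i + 1:]:
            dot = sum(u * v for u, v in zip(centered[a], centered[b]))
            norm_a = math.sqrt(sum(u * u for u in centered[a]))
            norm_b = math.sqrt(sum(v * v for v in centered[b]))
            cosines.append(dot / (norm_a * norm_b))
    return cosines

# Toy data (hypothetical): 3 classes whose features sit on the vertices
# of an equilateral triangle, i.e. a fully collapsed configuration in 2D.
s = math.sqrt(3) / 2
vertices = [(1.0, 0.0), (-0.5, s), (-0.5, -s)]
features = [v for v in vertices for _ in range(4)]
labels = [c for c in range(3) for _ in range(4)]

means = class_means(features, labels)
variance = within_class_variance(features, labels, means)
cosines = pairwise_mean_cosines(means)
```

On this toy input the within-class variance is numerically zero and every pairwise cosine is close to -1/(C-1) with C = 3. On real network features both quantities would only approach these values.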


Catastrophic Goodhart: regularizing RLHF with KL divergence does not mitigate heavy-tailed reward misspecification

Neural Information Processing Systems

However, if error is heavy-tailed, some policies obtain arbitrarily high reward despite achieving no more utility than the base model, a phenomenon we call catastrophic Goodhart. We adapt a discrete optimization method to measure the tails of reward models, finding that they are consistent with light-tailed error.






TreeVI: Reparameterizable Tree-structured Variational Inference for Instance-level Correlation Capturing

Neural Information Processing Systems

Mean-field variational inference (VI) is computationally scalable, but its highly demanding independence requirement hinders it from being applied to wider scenarios. Although many VI methods that take correlation into account have been proposed, these methods generally are not scalable enough to capture the correlation among data instances, which often arises in applications involving graphs or explicit constraints among instances.



5631e6ee59a4175cd06c305840562ff3-Paper.pdf

Neural Information Processing Systems

At each time step of the episode, the learner observes the current state of the environment, chooses one of the K available actions, and earns a reward. Consequently, the state of the environment changes according to the transition function of the underlying MDP, as a function of the previous state and the action taken by the learner.
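The interaction protocol described in this excerpt can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the paper's algorithm: the toy transition and reward functions, the uniform-random policy, and all names (`run_episode`, `toy_transition`, `toy_reward`, `uniform_policy`) are hypothetical.

```python
import random

def run_episode(transition, reward, initial_state, num_actions, horizon, policy, rng):
    """Simulate one episode and return a list of (state, action, reward) tuples."""
    state = initial_state
    trajectory = []
    for _ in range(horizon):
        # The learner observes the current state and picks one of K actions.
        action = policy(state, num_actions, rng)
        # The learner earns a reward for this (state, action) pair.
        r = reward(state, action)
        trajectory.append((state, action, r))
        # The next state depends on the previous state and the chosen action.
        state = transition(state, action, rng)
    return trajectory

# Hypothetical toy MDP with 3 states and K = 2 actions.
NUM_STATES = 3
K = 2

def toy_transition(state, action, rng):
    # Deterministic for simplicity; rng is unused here but kept so that
    # a stochastic transition function can share the same signature.
    return (state + action) % NUM_STATES

def toy_reward(state, action):
    return 1.0 if action == state % K else 0.0

def uniform_policy(state, num_actions, rng):
    # Placeholder learner: ignores the state and acts uniformly at random.
    return rng.randrange(num_actions)

rng = random.Random(0)
trajectory = run_episode(toy_transition, toy_reward, 0, K, 10, uniform_policy, rng)
total_reward = sum(r for _, _, r in trajectory)
```

A learning algorithm would replace `uniform_policy` with a rule that uses past trajectories. The loop itself stays the same.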