Context Shift Reduction for Offline Meta-Reinforcement Learning Yunkai Gao

Neural Information Processing Systems

Offline meta-reinforcement learning (OMRL) utilizes pre-collected offline datasets to enhance the agent's generalization ability on unseen tasks.




Reranking Laws for Language Generation: A Communication-Theoretic Perspective

Neural Information Processing Systems

To ensure large language models (LLMs) are used safely, one must reduce their propensity to hallucinate or to generate unacceptable answers. A simple and often used strategy is to first let the LLM generate multiple hypotheses and then employ a reranker to choose the best one.