Towards Provable Emergence of In-Context Reinforcement Learning
Neural Information Processing Systems
Typically, a modern reinforcement learning (RL) agent solves a task by updating its neural network parameters to adapt its policy to the task. Recently, it has been observed that some RL agents, after pretraining on some task distribution, can solve a wide range of new out-of-distribution tasks without parameter updates. When evaluated on a new task, the pretrained agent makes no parameter updates and instead conditions its policy on an additional input called the context, e.g., the agent's interaction history in the new task. The agent's performance improves as the amount of information in the context grows, while the agent's parameters stay fixed. This phenomenon is typically called in-context RL (ICRL). The pretrained parameters of the agent network enable the remarkable ICRL phenomenon.
Jun-15-2026, 17:19:33 GMT