Goto

Collaborating Authors

 Reinforcement Learning


Large-ScaleRetrievalforReinforcementLearning

Neural Information Processing Systems

Thisallows agents to directly learn in an end-to-end manner to utilise relevant information to inform their outputs. In addition, new information can be attended to by the agent, without retraining, by simply augmenting the retrieval dataset.


Large-ScaleRetrievalforReinforcementLearning

Neural Information Processing Systems

Thisallows agents to directly learn in an end-to-end manner to utilise relevant information to inform their outputs. In addition, new information can be attended to by the agent, without retraining, by simply augmenting the retrieval dataset.







Constrainedepisodicreinforcementlearningin concave-convexandknapsacksettings

Neural Information Processing Systems

Our approach relies on the principle ofoptimism under uncertaintyto efficiently explore. Our learning algorithms optimizetheiractions withrespect toamodel based ontheempirical statistics, while optimistically overestimating rewards and underestimating the resource consumption (i.e., overestimating the distance from the constraint).