Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
Neural Information Processing Systems
Nov-21-2025, 07:11:45 GMT
- Country:
- Europe > United Kingdom
- England > Greater London > London (0.04)
- North America > United States
- California > Los Angeles County > Long Beach (0.04)
- Technology: