Reviews: Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems
Neural Information Processing Systems
The restrictive assumption of restarting (which also significantly simplifies the regret analysis) was not mentioned. Note that the work by Liu, et.
Jan-22-2025, 15:09:36 GMT