Fast Pure Exploration via Frank-Wolfe
Neural Information Processing Systems
Consider K arms whose reward distributions (ν_1, ..., ν_K) belong to a one-dimensional exponential family and have unknown means µ = (µ_1, ..., µ_K).
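As a rough illustration of this setting only, the sketch below simulates K arms with Bernoulli rewards (Bernoulli distributions form a one-dimensional exponential family) and estimates each unknown mean from samples. The function names `pull` and `empirical_means`, the uniform sampling scheme, and the example means are assumptions made for illustration; this is not the Frank-Wolfe sampling rule the paper studies.

```python
import random


def pull(means, arm, rng):
    """Draw one Bernoulli reward from the given arm (hypothetical helper)."""
    return 1.0 if rng.random() < means[arm] else 0.0


def empirical_means(means, pulls_per_arm, seed=0):
    """Pull every arm the same number of times and return the sample means.

    `means` plays the role of the unknown vector mu; the learner only
    sees the rewards, never `means` itself.
    """
    rng = random.Random(seed)
    estimates = []
    for arm in range(len(means)):
        total = sum(pull(means, arm, rng) for _ in range(pulls_per_arm))
        estimates.append(total / pulls_per_arm)
    return estimates


if __name__ == "__main__":
    mu = [0.2, 0.5, 0.8]  # illustrative means, K = 3
    print(empirical_means(mu, pulls_per_arm=1000))
```

Uniform sampling is used here purely to keep the sketch short; a pure-exploration algorithm would instead adapt how often each arm is pulled.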
Feb-8-2026, 01:57:12 GMT