Multi-armed Bandit Requiring Monotone Arm Sequences

Neural Information Processing Systems 

Popular algorithms such as UCB [4,5] and Thompson sampling [3,34] typically explore the arms sufficiently and, as more evidence is gathered, converge to the optimal arm.
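To make the explore-then-converge behaviour concrete, the following is a minimal sketch of the standard UCB1 index rule. It is not the method this paper proposes. The function name `ucb1`, the `pull` callback, and the Bernoulli success probabilities in the demo are assumptions made for illustration only.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; return per-arm pull counts and mean rewards.

    `pull` is a hypothetical callback mapping an arm index to a reward in [0, 1].
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play each arm once so every estimate is initialised.
            arm = t - 1
        else:
            # Empirical mean plus an exploration bonus that shrinks
            # as an arm accumulates pulls.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the running mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.2, 0.5, 0.8]  # assumed Bernoulli success probabilities
    counts, means = ucb1(
        lambda a: 1.0 if rng.random() < probs[a] else 0.0, len(probs), 5000
    )
    print(counts)
```

In a run like this, the pull counts would be expected to concentrate on the arm with the highest mean as the horizon grows, while the other arms still receive occasional exploratory pulls.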
