Neural Information Processing Systems
The multi-armed bandit (MAB) problem is one of the classic models of sequential decision making (Auer et al., 2002; Agrawal and Goyal, 2012, 2013a).
Feb-19-2026, 09:45:54 GMT