OptimalAlgorithmsforStochasticContextual PreferenceBandits
–Neural Information Processing Systems
Yet, same as the classical setup, the goal is still to compete against the best context arm at each round.
Neural Information Processing Systems
Feb-12-2026, 01:01:32 GMT