Reviews: Dueling Bandits: Beyond Condorcet Winners to General Tournament Solutions
–Neural Information Processing Systems
This paper deals with the dueling bandit problem in situations where the goal is to find other notions of winners than the Condorcet one (which is not guaranteed to exist) or the Copeland one (which does always exist, but has certain flaws). Unlike the MAB problem, the question of what constitutes the best arm(s) is not completely straightforward to answer in the dueling setting, in particular when there is no single arm that is preferred to all other arms (aka the Condorcet winner). Indeed, the field of social choice theory has devised a near infinite supply of such notions of winners: the reason for this overabundance of definitions is that in the absence of a Condorcet winner, it is difficult to find a definition that satisfies all reasonable properties that one might hope for, so various definitions have been proposed that satisfy various sets of desirable properties. Given that, it is natural to seek algorithms that can find these other winners, since depending on the application domain, it might very well be that the Condorcet winner does not exist and that the Copeland winner is undesirable. With this in mind, the authors propose a flexible algorithm that is capable of accommodating such needs, given certain rather weak assumptions on the sought after notion of winner.
Neural Information Processing Systems
Jan-20-2025, 23:06:30 GMT