Reviews: Re-evaluating evaluation

Neural Information Processing Systems 

This paper applies game theory (of Nash equilibria, though 2-player zero-sum games suffice for most of the arguments) to the problem of evaluating a number of agents on a slate of tasks (and/or against other agents). The evaluation methods presented have the important property of being robust to non-systematic addition of more agents and tasks. The paper casts the problem in a correct mathematical framework leveraging the Hodge decomposition of skew-symmetric matrices, and its generalization in combinatorial Hodge theory. This allows for a very illuminating and general view of the structure of evaluation, leading to a generalization of the Elo rating that eschews the transitivity assumption by embedding ratings in more dimensions, as is required in general, and a "Nash averaging" method which I think will be the more lasting contribution. The position taken by the paper on evaluation is also increasingly important at the current time, in my opinion. There are strong connections to the contextual bandits setting in the dueling case, in which feedback is limited to preferences among pairs of actions, which the learner chooses on the fly ([1] and related work).