Online Learning with Set-Valued Feedback
Raman, Vinod, Subedi, Unique, Tewari, Ambuj
We study a variant of online multiclass classification where the learner predicts a single label but receives a \textit{set of labels} as feedback. In this model, the learner is penalized for not outputting a label contained in the revealed set. We show that unlike online multiclass learning with single-label feedback, deterministic and randomized online learnability are \textit{not equivalent} in the realizable setting under set-valued feedback. In addition, we show that deterministic and randomized realizable learnability are equivalent if the Helly number of the collection of sets that can be revealed as feedback is finite. In light of this separation, we give two new combinatorial dimensions, named the Set Littlestone and Measure Shattering dimension, whose finiteness characterizes deterministic and randomized realizable learnability respectively. Additionally, these dimensions lower- and upper bound the deterministic and randomized minimax regret in the realizable setting. Going beyond the realizable setting, we prove that the Measure shattering dimension continues to characterize learnability and quantify minimax regret in the agnostic setting. Finally, we use our results to establish bounds on the minimax regret for three practical learning settings: online multilabel ranking, online multilabel classification, and real-valued prediction with interval-valued response.
Nov-19-2023