Oceania
SupplementaryMaterial
Recall the definition: a set function f(S) is submodular, if for any subsets S S0 Z, and i Z S0, f(S {i}) f(S) f(S0 {i}) f(S0). For experiments in section 5.2, all checkpoints are instances of resnet-50. They are trained by a batch size of 128, and an initial learning rate of 0.1. We run for 200 epochs, with learning rate decay at the 60th, 120th and 160th epoch. A typical validation accuracy from these checkpoint (on its own task) is about 83% (reasonably good).
Learning
Whiletheseapproaches arewidely used inpractice andachieveimpressiveempirical gains, their theoretical understanding largely lags behind. Towards bridging this gap, we present a unifying perspectivewhere several such approaches can beviewed asimposing a regularization on the representation via alearnable function using unlabeled data. Wepropose adiscriminativetheoretical framework for analyzing the sample complexity of these approaches, which generalizes the framework of [3] to allow learnable regularization functions.