Reviews: Faster Online Learning of Optimal Threshold for Consistent F-measure Optimization
–Neural Information Processing Systems
A) My main concern with this paper is with the main results (Theorems 2 and 3). The authors do not seem to have taken sufficient care over the fact that \partial \hat{Q} in Algorithm 2 is a biased estimator of the true gradient \partial Q. Also, \hat{Q}, defined in Line 189, depends on \hat{\pi}, which is itself an estimate of \pi. Thus, a probabilistic proof would need to consider the conditional probability of the error in estimating Q given the estimate of \pi.
B) Regardless of the above, the final high-probability statements in Theorems 2 and 3 seem to be missing a union bound over the error probability in Assumption 1.
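To make point B concrete, here is a minimal sketch of the missing step. The notation is assumed for illustration and is not taken from the paper: A denotes the event that the error bound in Assumption 1 fails, B denotes the event that the concentration argument inside the proof of Theorem 2 or 3 fails, and \delta_1, \delta_2 are hypothetical bounds on their probabilities.

```latex
% Assumed notation (not from the paper):
%   A = event that the error bound of Assumption 1 fails, with P(A) <= \delta_1
%   B = event that the theorem's concentration step fails, with P(B) <= \delta_2
\begin{align*}
\Pr(A \cup B) &\le \Pr(A) + \Pr(B) \le \delta_1 + \delta_2, \\
\Pr(A^{c} \cap B^{c}) &= 1 - \Pr(A \cup B) \ge 1 - \delta_1 - \delta_2 .
\end{align*}
```

Under these assumptions, the final guarantee can only be claimed with probability at least 1 - \delta_1 - \delta_2, not 1 - \delta_2 alone, unless Assumption 1 is stated to hold deterministically.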
Oct-7-2024, 11:36:37 GMT