Reviews: Near-Optimal Smoothing of Structured Conditional Probability Matrices

Jan-20-2025, 15:24:46 GMT–Neural Information Processing Systems

If my understanding is correct, Theorem 1 does not quite apply to the authors' algorithm, ADD-1/2-Smoothed Low-Rank. Instead, it applies to an idealized, possibly non-computable estimator that is assumed to have access to a minimizer of the objective function in Theorem 3. It is not clear whether the alternating optimization algorithm proposed in the paper is guaranteed to converge to a minimizer of that objective. If it carries no such guarantee, the authors should say so before stating Theorem 1, to avoid misleading the reader.

The "discounting" appears to be important, judging from the Experiments section, but it is not described in the main paper. If it matters this much, the authors should make room for it in the main paper.

The main results (Theorems 1 and 2) are not very surprising, given that this is almost a parametric estimation problem with roughly mk parameters, so a rate of order km/n is what one would expect.
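The parametric heuristic behind the last remark can be sketched as follows. This is a back-of-the-envelope count, not a statement taken from the paper: it assumes the k x k conditional probability matrix P factors as P = AB with row-stochastic A (k x m) and B (m x k), and it borrows the classical asymptotic KL risk of maximum likelihood in a regular d-parameter model with n i.i.d. samples. The symbols A, B, d, and \hat{P} are introduced here for illustration only.

```latex
% Assumption: P = A B with A (k x m) and B (m x k) both row-stochastic.
% Each row of A has m-1 free entries; each row of B has k-1 free entries.
d \;\le\; k(m-1) + m(k-1) \;=\; 2km - k - m \;=\; O(km),
\qquad
\mathbb{E}\, D_{\mathrm{KL}}\!\left(P \,\middle\|\, \hat{P}\right)
\;\approx\; \frac{d}{2n} \;=\; O\!\left(\frac{km}{n}\right).
```

The heuristic is asymptotic and ignores entries with small or zero counts, which is the regime smoothing is meant to handle, so it only suggests the order of the rate and does not replace a finite-sample analysis.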