Review for NeurIPS paper: Optimal Iterative Sketching Methods with the Subsampled Randomized Hadamard Transform
–Neural Information Processing Systems
Additional Feedback: Please find below a list of questions and comments:
- 1) Did you experiment with applying the truncated Walsh-Hadamard transform (Ailon & Liberty, in Discrete & Computational Geometry, 2009) when using the SRHT?
- 2) Lines 199-207: Could you comment on the fact that m is constrained to be greater than d? Would it be possible to achieve better performance with a smaller m? Is there any theory about this?
- 3) Could you please give the precise references claiming that m \approx d \log(d) is prescribed for state-of-the-art algorithms, and say for which algorithms?
- 4) Below Theorem 4.1, there is an explanation of why the additional assumption \mathbb{E}[\Delta_0 \Delta_0^\top] = (1/d) I_d is a "mild assumption". I did not understand the provided argument. Is this assumption often/easily met?
- 5) Lines 222-223 and 264-265: it is mentioned that "SRHT [...] contains less randomness, but is more structured and faster to generate" than the Haar matrix.
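To make the contrast quoted in comment 5 concrete, here is a minimal NumPy sketch of the two embeddings. It is an illustration under stated assumptions, not the authors' implementation. The SRHT variant used is the common form S = sqrt(n/m) R H D (random signs D, orthonormal Hadamard H, uniform row subsampling R without replacement), which may differ from the paper's exact definition. The function names `fwht`, `srht_sketch` and `haar_sketch` are hypothetical, and n is assumed to be a power of 2.

```python
import numpy as np


def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform along axis 0.

    The number of rows must be a power of 2 (assumption of this sketch).
    """
    x = np.array(x, dtype=float)  # copy so the caller's array is untouched
    n = x.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("number of rows must be a power of 2")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h pairwise.
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)


def srht_sketch(A, m, rng):
    """Return S @ A for S = sqrt(n/m) * R * H * D (one common SRHT variant)."""
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)      # D: n random sign flips
    HDA = fwht(signs[:, None] * A)               # H D A without forming H
    rows = rng.choice(n, size=m, replace=False)  # R: uniform row subsample
    return np.sqrt(n / m) * HDA[rows]


def haar_sketch(A, m, rng):
    """Return S @ A where S has m orthonormal rows drawn from the Haar measure."""
    n = A.shape[0]
    G = rng.standard_normal((n, m))              # n * m Gaussian entries
    Q, R = np.linalg.qr(G)                       # Q: n x m, orthonormal columns
    Q = Q * np.sign(np.diag(R))                  # sign fix so Q is Haar-distributed
    return Q.T @ A
```

In this sketch the SRHT draws only n signs and m row indices and applies H through the fast transform, whereas the Haar embedding draws n*m Gaussian entries and pays for a QR factorization and a dense product. Note that the two functions use different scalings (the SRHT rows are rescaled by sqrt(n/m), the Haar rows are not), so their outputs are not directly comparable without renormalizing.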
Jan-25-2025, 13:24:59 GMT