
Comments on presentation: Thank you for the helpful suggestions. We focus on a Mercer kernel of the form $k_\theta(z_t, z) = q_\theta(z_t^\top z) = h_t^\top h$. Even so, we obtain SOTA results for recurrent models on all document classification tasks, with the exception of AG News, for which we're competitive.