Reviews: An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing Systems 

Summary: The authors provided an empirical study contrasting neural networks and kernel methods, with a focus on how random and adaptive schemes would make efficient use of features in order to improve quality of learning, at four levels of abstraction: data-agnostic random basis (baseline kernel machines with traditional random features), unsupervised data-adaptive basis for better approximation of kernel function, supervised data-label-adaptive basis by kernel target alignment, discriminatively adaptive basis (neural nets). The paper concluded with several suggestions and caveats for efficient use of random features in practice. Comments: - 1 - Line 123, especially for sake of comparing UAB case where the underlying assumption is that using the true kernel function k in prediction yields the "best" performance so that UAB tries to approximate it, I would suggest testing in experiments a baseline model that utilizes the true kernel function k in prediction. Also this would suggest, for example in Figure 1 at which point of the KAE curve the accuracy is sufficiently good (despite many theoretical results available). However, in order to support those conclusions to a definitively convincing extent, more datasets should be needed. For example, the performance scores in Tab. 1 do not seem to be too significantly different marginally for each task.