Approximate Cross-Validation with Low-Rank Data in High Dimensions

Neural Information Processing Systems

Many recent advances in machine learning are driven by a challenging trifecta: large data size $N$, high dimensions, and expensive algorithms. In this setting, cross-validation (CV) serves as an important tool for model assessment. Recent advances in approximate cross-validation (ACV) provide accurate approximations to CV with only a single model fit, avoiding traditional CV's requirement for repeated runs of expensive algorithms. Unfortunately, these ACV methods can lose both speed and accuracy in high dimensions, unless sparsity structure is present in the data. Fortunately, there is an alternative type of simplifying structure that is present in most data: approximate low rank (ALR).
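The single-fit idea behind ACV can be illustrated with a classical special case: for ridge regression, exact leave-one-out residuals are available from one model fit via the hat-matrix diagonal, and a thin SVD makes the computation cheap when the data are (approximately) low rank. The sketch below is illustrative only, assuming a ridge model; it is not the paper's ACV estimator, and `ridge_loo_errors` is a name of my own choosing.

```python
import numpy as np

def ridge_loo_errors(X, y, lam):
    """Exact leave-one-out residuals for ridge regression from one SVD.

    With X = U @ diag(s) @ Vt, the hat matrix is H = U diag(s^2/(s^2+lam)) U^T,
    so only the singular directions of X contribute to its diagonal; one fit
    replaces N refits. (Illustrative sketch, not the paper's ACV method.)
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)   # thin SVD
    shrink = s**2 / (s**2 + lam)                       # spectral shrinkage factors
    # Ridge solution from the same SVD: beta = V diag(s/(s^2+lam)) U^T y
    beta = Vt.T @ ((s / (s**2 + lam)) * (U.T @ y))
    resid = y - X @ beta                               # in-sample residuals
    h = np.einsum("ij,j,ij->i", U, shrink, U)          # hat-matrix diagonal h_ii
    return resid / (1.0 - h)                           # exact LOO residuals
```

For ridge regression this shortcut is exact (a Sherman-Morrison identity); the ACV literature the abstract refers to generalizes the same "one fit plus a correction" idea to models where only an approximation is available.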



Review for NeurIPS paper: Approximate Cross-Validation with Low-Rank Data in High Dimensions

Neural Information Processing Systems

Weaknesses: I think the significance of the results (perhaps because of how they are presented) falls below the threshold for acceptance. 1) The first weakness is that there is no discussion of whether the upper bound (mentioned in the strengths) is tight, or of when it implies consistency, i.e., that the error goes to 0 under a suitable limit. Note that the norm of the true signal, the scale of the feature matrix, and the best tuning parameter need to satisfy certain order conditions for the problem to be meaningful. A common approach is to apply PCA and perform feature selection first; the authors should then compare their results with prior work on the selected features. After response: I noticed Corollaries 1 and 2, but together they only cover the trivial case in which the sample size goes to infinity while the rank of the feature matrix is bounded by a constant.


Review for NeurIPS paper: Approximate Cross-Validation with Low-Rank Data in High Dimensions

Neural Information Processing Systems

Two reviewers agree that this submission represents an important contribution to the field. However, a third expressed significant concerns about the tightness of the presented bounds, the accommodation of matrices with growing rank, and behavior in the presence of principal component preprocessing. Please be sure to carefully review and address the concerns of all reviewers in the revision.
