Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper reduces a broad class of machine learning problems involving latent variables to the problem of finding anchors defining the conical hull of the data (via the method of moments). In addition, it proposes a new divide-and-conquer algorithm based on random projections to speed up the search for the anchors. Overall, I found this an interesting paper presenting significant contributions. However, the presentation could be greatly improved, as it lacks clarity in places. It looks like this paper was squeezed in a hurry to fit the 8-page limit.


The authors discuss how the problems can be formulated as optimization of objective functions defined on the subgraphs. A straightforward search over the subgraphs is computationally infeasible, so the authors present a highly novel approach that leads to computationally efficient tests. The paper includes proofs that the tests are nearly minimax optimal for the exponential family of distributions and graphs satisfying the polynomial growth property. The paper concludes with an analysis of synthetic and real datasets. Strengths: (1) The paper addresses a problem of growing importance and presents novel approaches for statistical tests.


NIPS (Neural Information Processing Systems), 8-11 December 2014, Montreal, Canada. Paper ID: 1612. Title: "Improved Distributed Principal Component Analysis". This paper considers the problem of trading off the communication and computation costs of distributed computation and proposes a new distributed k L-2 error fitting algorithm. The proposed algorithm can be seen as a combination of many previous speed-up techniques for distributed PCA and clustering methods. However, the authors also contribute optimizations over the base methods that further improve communication and computation efficiency. The theoretical guarantee is sound and the experiments are convincing.


This paper studies the estimation of the k-dimensional principal subspace of a population matrix based on the sample covariance matrix. Two estimators based on convex and non-convex optimization are developed for projection matrices with large- and small-magnitude entries, respectively. Both estimators are shown to enjoy satisfactory theoretical properties and to perform well experimentally compared with state-of-the-art estimators. It would be better to clearly explain what oracle knowledge is used in the proposed algorithm, and how the oracle estimator comparison experiments are set up.


The paper introduces a novel class of convex region-specific linear models called partition-wise linear models. It assigns linear models to partitions of the input space, and linear combinations of these partition-specific models define the region-specific linear models. This allows the authors to construct convex objective functions. They optimize both the regions and the predictors by using sparsity-inducing structured penalties.


This paper studies a multi-armed bandit problem with a set of features, where the expected reward of an action is a Lipschitz-continuous function of the relevant features. This is also a feature selection problem, where you have a set of features but only r of them are relevant (the target function depends on only r of these features): here each arm has only one relevant feature, meaning the payoff function of the arm depends on only one feature, and we do not know which one. The authors propose an algorithm and obtain a regret bound for this adaptive case, but their regret is higher than what you would get if someone told you the relevant type. Q2: Please summarize your review in 1-2 sentences. This paper makes a small step towards understanding the problem of having a subset of features being relevant for a given arm, which is certainly an interesting problem in itself: the authors study the bandit problem only for one relevant feature per arm and do not give the optimal rate. Potentially, they could handle an arbitrary number of relevant features and determine the optimal regret.


This paper investigates fast convergence properties of the proximal gradient method and the proximal Newton method under the assumption of Constant Nullspace Strong Convexity (CNSC). The problem of interest is to minimize the sum of two convex functions f(x)+h(x), where f is twice differentiable (smooth) and h can be non-smooth but admits a simple proximal mapping. Under the CNSC assumption on f, and assuming h is a decomposable norm, the paper shows global geometric convergence of the proximal gradient method and local quadratic convergence of the proximal Newton method. The writing is very clear.
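To make the problem setting in this review concrete, here is a minimal sketch of the proximal gradient iteration for f(x) + h(x), assuming f(x) = 0.5 * ||Ax - b||^2 and h(x) = lam * ||x||_1, whose proximal map is soft-thresholding. These choices, the function names, and the toy data are illustrative assumptions and are not taken from the paper under review. The matrix A is deliberately rank-deficient, so the Hessian A^T A has a fixed nullspace and f is not strongly convex.

```python
def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, applied elementwise (assumed choice of h)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def grad_least_squares(A, b, x):
    """Gradient of f(x) = 0.5 * ||A x - b||^2, which is A^T (A x - b)."""
    residual = [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
        for row, b_i in zip(A, b)
    ]
    return [
        sum(A[i][j] * residual[i] for i in range(len(A)))
        for j in range(len(x))
    ]


def proximal_gradient(A, b, lam, step, iters=200):
    """Iterate x <- prox_{step * h}(x - step * grad f(x)), starting from zero."""
    x = [0.0] * len(A[0])
    for _ in range(iters):
        g = grad_least_squares(A, b, x)
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)], step * lam)
    return x


# Toy problem: A has a one-dimensional nullspace, so f alone has many minimizers.
# step = 0.5 is the reciprocal of the largest eigenvalue of A^T A (which is 2).
x_hat = proximal_gradient([[1.0, 1.0]], [2.0], lam=0.5, step=0.5)
print(x_hat)  # [0.75, 0.75]
```

Started from zero, the iterate reaches [0.75, 0.75] after one step and stays there, even though f on its own is flat along the direction (1, -1).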


This paper describes a non-negative matrix factorization algorithm for tall-and-skinny matrices. The algorithm works in the big-data scenario because it needs only one pass over the tall-and-skinny matrix. This linear read of the matrix does not fully utilize the distributed MapReduce framework. I wonder, is it possible to parallelize the reading of the matrix and combine the results from subsets of the data into one final result?
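One way to see why the reviewer's suggestion is plausible is that some summaries of a tall-and-skinny matrix are additive over row blocks. The sketch below is an illustrative assumption rather than the paper's algorithm: it computes the small n-by-n Gram matrix A^T A separately on two row chunks and sums the partial results, which matches a single pass over all rows. The helper names are hypothetical.

```python
def gram(rows, n_cols):
    """Accumulate A^T A over the given rows in a single pass."""
    g = [[0.0] * n_cols for _ in range(n_cols)]
    for row in rows:
        for i in range(n_cols):
            for j in range(n_cols):
                g[i][j] += row[i] * row[j]
    return g


def add_matrices(a, b):
    """Elementwise sum of two equally sized matrices (the 'combine' step)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# A tall-and-skinny toy matrix: 4 rows, 2 columns.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Each chunk could be handled by a separate worker; only 2x2 results are merged.
combined = add_matrices(gram(A[:2], 2), gram(A[2:], 2))
print(combined == gram(A, 2))  # True
```

Because each partial result has size n-by-n regardless of how many rows a worker reads, the merge step stays cheap when the matrix is much taller than it is wide.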


This paper studies the rank aggregation problem where a global ranking is inferred from multiple partial rankings. While assuming the partial rankings are generated according to the Plackett-Luce (PL) model, some of the results in the paper apply to the more general Thurstone's model as well. It provides theoretical results quantifying the required number of item assignments from users and analyzes the case where only pairwise comparisons are used as aggregation input. I find the results of the latter, i.e., rank-breaking upper bounds, especially interesting.
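As a rough illustration of rank-breaking, the sketch below converts each partial ranking into the pairwise comparisons it implies and counts them. This is a generic full-breaking scheme with hypothetical function names; the paper's estimator and any weighting it applies to the broken pairs are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def rank_break(ranking):
    """Return every (winner, loser) pair implied by one ranking, best item first."""
    return list(combinations(ranking, 2))


def pairwise_wins(rankings):
    """Count how often each ordered pair occurs across all partial rankings."""
    wins = Counter()
    for ranking in rankings:
        wins.update(rank_break(ranking))
    return wins


# Two users rank different subsets of the items.
wins = pairwise_wins([["a", "b", "c"], ["b", "a"]])
print(wins[("a", "b")], wins[("b", "a")], wins[("a", "c")])  # 1 1 1
```

The resulting counts are the kind of pairwise input that an aggregation method restricted to comparisons would consume.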


The authors propose a novel approach for hierarchical clustering of multivariate data. They construct cluster trees by estimating minimum volume sets using the q-One-Class SVM, and evaluate their method on a synthetic data set and two real-world applications. While their new method seems to perform better than other approaches based on density estimation, I am not convinced of its practical benefits, as the authors did not compare their method to the most commonly used hierarchical clustering techniques (agglomerative clustering with average linkage/Ward). Minor comment: Rather than splitting their data once into a training and a test set, the authors should perform 10-fold or 5-fold cross-validation for a more reliable estimate of the generalizability of their method.
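The cross-validation suggested in the minor comment could be set up along these lines. This is a minimal sketch using only the standard library; the function name, the fixed seed, and the strided fold assignment are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists so that every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_indices(n_samples=10, k=5):
    # Fit on `train`, evaluate on `test`, then average the k scores.
    print(len(train), len(test))  # 8 2 on every fold
```

Averaging the score over the k held-out folds gives a less split-dependent estimate than a single training/test partition.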