Learning in High Dimensional Spaces


Super-Bit Locality-Sensitive Hashing

Jianqiu Ji, Jianmin Li, Shuicheng Yan, Bo Zhang, Qi Tian

Neural Information Processing Systems

Sign-random-projection locality-sensitive hashing (SRP-LSH) is a probabilistic dimension reduction method which provides an unbiased estimate of angular similarity, yet suffers from the large variance of its estimation. In this work, we propose Super-Bit locality-sensitive hashing (SBLSH). It is easy to implement: it orthogonalizes the random projection vectors in batches, and it is theoretically guaranteed that SBLSH also provides an unbiased estimate of angular similarity, yet with a smaller variance when the angle to estimate is within (0, π/2]. Extensive experiments on real data validate that, given the same length of binary code, SBLSH achieves a significant reduction in mean squared error when estimating pairwise angular similarity. Moreover, SBLSH outperforms SRP-LSH in approximate nearest neighbor (ANN) retrieval experiments.
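The two schemes described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names (`srp_hash`, `super_bit_hash`, `est_angle`) are made up for the example, and the Super-Bit variant is approximated here by QR-orthogonalizing the projection vectors in batches before taking signs.

```python
import numpy as np

def srp_hash(X, K, rng):
    """SRP-LSH: sign bits from K random Gaussian hyperplanes."""
    W = rng.standard_normal((X.shape[1], K))
    return np.sign(X @ W)

def super_bit_hash(X, K, batch, rng):
    """Super-Bit sketch: orthogonalize the random projection vectors
    in batches of size `batch` (at most the data dimension) via QR,
    then take signs, as the abstract describes."""
    d = X.shape[1]
    cols = []
    for start in range(0, K, batch):
        b = min(batch, K - start)
        G = rng.standard_normal((d, b))
        Q, _ = np.linalg.qr(G)  # orthonormal columns within the batch
        cols.append(Q)
    return np.sign(X @ np.hstack(cols))

def est_angle(code_a, code_b):
    """Unbiased angle estimate: fraction of disagreeing bits times pi."""
    return np.mean(code_a != code_b) * np.pi
```

For two orthogonal vectors both estimators concentrate around π/2; the claimed advantage of the Super-Bit variant is the smaller variance of that estimate at a fixed code length.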


Review for NeurIPS paper: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality

Neural Information Processing Systems

Summary and Contributions: ***** UPDATE ***** I realize I might have been harsh in my evaluation. I believe the paper would have been better suited to a more theory-oriented statistics conference or journal, but this is a recurrent problem at NeurIPS and I shouldn't have taken it out on the authors. While their theoretical result is really interesting, I also didn't appreciate that the authors barely mentioned previous work on statistical learning bounds with optimal transport. There have been recent efforts on the topic by several teams, and they should at least acknowledge them. However, if other reviewers took the time to thoroughly review the proof of the main result, I'm willing to increase my score.


Review for NeurIPS paper: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality

Neural Information Processing Systems

Most of the reviewers were excited about this work, and I'm pleased to recommend it for publication. In the revision, please address all changes promised in the rebuttal and/or requested in the reviews. The outlier R1 has some valid points about the exposition, as well as discomfort with the length of the appendix (it's true this is difficult to review in the NeurIPS environment), but these are not reasons to reject the work. That said, the authors are encouraged to take R1's expository suggestions seriously in their revision to make the work as approachable as possible.


Review for NeurIPS paper: Sufficient dimension reduction for classification using principal optimal transport direction

Neural Information Processing Systems

Do q in Line 22 and r in Line 185 denote the same thing? Under what conditions does the equivalence hold? Do these conditions automatically hold for this paper? Does the original dimensionality prohibit the evaluation? Unfortunately, the authors dodged most of my questions that might hurt the paper, and my concerns still stand.


Review for NeurIPS paper: Sufficient dimension reduction for classification using principal optimal transport direction

Neural Information Processing Systems

Two reviewers support accept, and two reviewers indicate reject. I've looked at the paper, reviews and rebuttal, and recommend an accept decision based on the significance, theoretical grounding and novelty of the contribution. It is refreshing to see a new approach to a problem relevant to the NeurIPS community. However, please revise the paper to provide better explanation of terminology and intuition as to why the method works; the reviewers' additional feedback and post-rebuttal comments should also be carefully considered in the revisions.


Review for NeurIPS paper: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

Neural Information Processing Systems

Weaknesses: Although this paper is an obvious improvement on Gao et al.'s work, I have to say that it lacks novelty, and the contribution is small. Width, running time, and activation function are not a huge gap in Gao et al.'s work. What I really want to see is removing the projection in deep nets, improving the online learning, or analyzing robust generalization. However, both the results and the proof methods offer little inspiration. This paper claims in the abstract that '(Gao et al.) require the width of the net and the running time to be exponential in input dimension d, and they consider an activation function that is not used in practice'.


Review for NeurIPS paper: Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

Neural Information Processing Systems

This paper proves that adversarial training of over-parameterized neural networks converges to a robust solution. Specifically, the paper studies two-layer ReLU networks with width that is polynomial in the input dimension, d, the number of training points, n, and the inverse of the robustness parameter, 1/\epsilon. The proof is by construction; an algorithm is proposed that, in poly(d, n, 1/\epsilon) iterations, finds a network with poly(d, n, 1/\epsilon) width that is \epsilon-robust. Adversarial training is an important and rapidly expanding field of ML. This paper fills in some gaps w.r.t.
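The setting the review describes (a two-layer ReLU network trained on adversarially perturbed inputs inside an ε-ball) can be sketched in plain NumPy. This is a minimal illustration of adversarial training in that setting, not the paper's constructed algorithm: the inner maximization is approximated by a single FGSM-style signed-gradient step, and all hyperparameters are arbitrary.

```python
import numpy as np

def forward(W, a, x):
    h = np.maximum(W @ x, 0.0)  # ReLU hidden layer
    return a @ h, h

def adversarial_training(X, y, width=64, eps=0.05, steps=3000, lr=0.02, seed=0):
    """Sketch: two-layer ReLU net f(x) = a . ReLU(W x), trained by SGD
    on the squared loss at an adversarially perturbed input. The inner
    max over the eps-ball is approximated by one signed-gradient step."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((width, d)) / np.sqrt(d)
    a = rng.standard_normal(width) / np.sqrt(width)
    for _ in range(steps):
        i = rng.integers(len(X))
        x, target = X[i], y[i]
        # inner max: ascend the loss w.r.t. the input (FGSM step)
        out, h = forward(W, a, x)
        grad_x = (out - target) * (W.T @ (a * (h > 0)))
        x_adv = x + eps * np.sign(grad_x)
        # outer min: SGD step on the loss at the adversarial point
        out, h = forward(W, a, x_adv)
        err = out - target
        W -= lr * err * np.outer(a * (h > 0), x_adv)
        a -= lr * err * h
    return W, a
```

The paper's contribution is showing that a (poly-width, poly-time) procedure of this flavor provably reaches an ε-robust solution; the sketch only conveys the alternating min-max structure.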


Unified Lower Bounds for Interactive High-dimensional Estimation under Information Constraints

Neural Information Processing Systems

We consider distributed parameter estimation using interactive protocols subject to local information constraints such as bandwidth limitations, local differential privacy, and restricted measurements. We provide a unified framework enabling us to derive a variety of (tight) minimax lower bounds for different parametric families of distributions, both continuous and discrete, under any \ell_p loss. Our lower bound framework is versatile and yields "plug-and-play" bounds that are widely applicable to a large range of estimation problems, and, for the prototypical case of the Gaussian family, circumvents limitations of previous techniques. In particular, our approach recovers bounds obtained using data processing inequalities and Cramér–Rao bounds, two other alternative approaches for proving lower bounds in our setting of interest. Further, for the families considered, we complement our lower bounds with matching upper bounds.


Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning

Neural Information Processing Systems

As annotations of data can be scarce in large-scale practical problems, leveraging unlabelled examples is one of the most important aspects of machine learning. This is the aim of semi-supervised learning. To benefit from access to unlabelled data, it is natural to smoothly diffuse knowledge from labelled data to unlabelled data. This motivates the use of Laplacian regularization. Yet, current implementations of Laplacian regularization suffer from several drawbacks, notably the well-known curse of dimensionality.
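The label-diffusion idea behind Laplacian regularization can be made concrete with the standard harmonic-function formulation: build a Gaussian affinity graph, form the Laplacian L = D − W, pin the labelled values, and solve for the unlabelled ones. This is a sketch of the classical baseline the abstract refers to, not the paper's proposed estimator; the function name `laplacian_ssl` and the bandwidth `sigma` are illustrative.

```python
import numpy as np

def laplacian_ssl(X, y_lab, lab_idx, sigma=1.0):
    """Harmonic-function solution of Laplacian-regularized SSL:
    minimize f^T L f with f fixed to y_lab on the labelled points,
    i.e. solve  L_uu f_u = -L_ul y_l  for the unlabelled values."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))   # Gaussian affinity graph
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W            # graph Laplacian D - W
    unl = np.setdiff1d(np.arange(n), lab_idx)
    f = np.zeros(n)
    f[lab_idx] = y_lab
    f[unl] = np.linalg.solve(L[np.ix_(unl, unl)],
                             -L[np.ix_(unl, lab_idx)] @ y_lab)
    return f
```

The curse of dimensionality the abstract mentions shows up in how such affinity graphs (and the resulting diffusion) degrade as the ambient dimension grows.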


SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities

Neural Information Processing Systems

Many approaches in machine learning rely on a weighted graph to encode the similarities between samples in a dataset. Entropic affinities (EAs), which are notably used in the popular Dimensionality Reduction (DR) algorithm t-SNE, are particular instances of such graphs. To ensure robustness to heterogeneous sampling densities, EAs assign a kernel bandwidth parameter to every sample in such a way that the entropy of each row in the affinity matrix is kept constant at a specific value, whose exponential is known as perplexity. EAs are inherently asymmetric and row-wise stochastic, but they are used in DR approaches after undergoing heuristic symmetrization methods that violate both the row-wise constant entropy and stochasticity properties. In this work, we uncover a novel characterization of EA as an optimal transport problem, allowing a natural symmetrization that can be computed efficiently using dual ascent.
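The per-sample bandwidth calibration the abstract describes (tune each row's kernel precision so the row entropy matches the log of a target perplexity) can be sketched with a simple bisection. This illustrates the classical t-SNE-style construction of entropic affinities that the paper starts from, not SNEkhorn's optimal-transport symmetrization; the function name and tolerances are illustrative.

```python
import numpy as np

def entropic_affinity_row(d2, perplexity, tol=1e-5, iters=60):
    """Calibrate one row of an entropic affinity: find the precision
    beta such that p_j ~ exp(-beta * d2_j) has entropy log(perplexity),
    by geometric bisection (entropy is decreasing in beta)."""
    target = np.log(perplexity)
    lo, hi = 1e-10, 1e10
    for _ in range(iters):
        beta = np.sqrt(lo * hi)              # geometric midpoint
        p = np.exp(-beta * (d2 - d2.min()))  # shift for stability
        p /= p.sum()
        H = -(p * np.log(np.maximum(p, 1e-300))).sum()
        if abs(H - target) < tol:
            break
        if H > target:
            lo = beta  # row too flat: increase precision
        else:
            hi = beta  # row too peaked: decrease precision
    return p
```

Running this per row yields the asymmetric, row-stochastic affinities; the heuristic symmetrizations criticized in the abstract (e.g. averaging P with its transpose) are exactly what destroy the constant-entropy and stochasticity properties that this calibration enforces.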