Review for NeurIPS paper: Exemplar Guided Active Learning

Neural Information Processing Systems

The paper proposes selecting unlabelled training examples based on the embedding distance between a given exemplar and the query data. A pretrained BERT model is used to compute the embeddings of the training examples. All the reviewers appreciated the problem formulation (selecting balanced labels from a highly skewed training set) and the complexity bound. The general consensus is that the paper makes an interesting contribution to active learning methods applied to word sense disambiguation. The current version of the paper would be greatly strengthened by including more datasets.
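The selection step described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes cosine distance as the embedding distance, uses toy 2-D vectors in place of BERT embeddings, and the helper names `cosine_distance` and `select_nearest` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def select_nearest(exemplar, pool, k):
    """Return the indices of the k pool embeddings closest to the exemplar."""
    ranked = sorted(range(len(pool)),
                    key=lambda i: cosine_distance(exemplar, pool[i]))
    return ranked[:k]


# Toy 2-D "embeddings" (assumption); a real pipeline would use vectors
# produced by a pretrained encoder such as BERT.
exemplar = [1.0, 0.0]
pool = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]

# Indices of the two unlabelled examples to send for labelling.
selected = select_nearest(exemplar, pool, k=2)
print(selected)
```

Ranking the whole pool by distance keeps the sketch short; for a large unlabelled pool, an approximate nearest-neighbour index would likely be a better fit.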


Review for NeurIPS paper: CompRess: Self-Supervised Learning by Compressing Representations

Neural Information Processing Systems

Weaknesses: One big issue I see is that it is not very meaningful to do model compression for unsupervised models before the current evolution of contrastive approaches plateaus. When a better contrastive method appears, why would we still need the distillation method proposed today instead of using the new contrastive method directly? This distillation method would then quickly fade away. For instance, IIRC, [a] achieves 73.0% linear accuracy and transfers better to Pascal VOC and COCO, so the effectiveness of this paper will be largely discounted (by the way, [a] might be discussed as well). It is also unclear how your method could improve upon better self-supervised methods; e.g., can you improve upon [a] using your method out of the box?


Review for NeurIPS paper: CompRess: Self-Supervised Learning by Compressing Representations

Neural Information Processing Systems

This paper presents an approach for distillation of self-supervised models. All the reviewers acknowledge that the paper presents a simple approach which outperforms several baselines. There are some concerns with respect to: (a) the speed with which the SSL field changes and the applicability to new approaches; (b) the clarity of the tables; (c) the claim of outperforming supervised AlexNet. There was a rebuttal which answered some of the concerns. The AC agrees with the authors that we should not wait for better models before working on model compression.


Review for NeurIPS paper: LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration

Neural Information Processing Systems

Weaknesses: This is not a weakness per se, but a suggestion to make the paper stronger. In contrast to the existing work, the authors argue several times that a UV parameterization is inherently inferior to 3D representations, as it requires seam cuts and results in distortion of highly curved regions, etc. While this is conceptually correct, the paper would have been stronger if the authors had also demonstrated it empirically for their problem. For example, this could be done on a simpler problem, such as the fully supervised case, or the case where the entire pipeline is not necessarily end-to-end differentiable but is a combination of landmark/correspondence estimation and a traditional optimization approach. It would be interesting to see whether using the signed distance representation to predict correspondences with a CNN, along with its Lagrangian loss formulation to encourage points to lie on the surface, improves the accuracy of correspondence prediction by itself and, if so, by how much compared with an approach that learns to map scan points to the UV space instead.


Review for NeurIPS paper: LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration

Neural Information Processing Systems

The rebuttal addressed the main criticisms raised by the reviewers: the assumption on warm start, the robustness to noise, and the clarification of the model. The answers of the authors contributed to the discussion and the proper evaluation of this work. The terminological issue doesn't affect the final decision.


Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data

Neural Information Processing Systems

Clarity: Overall the paper is very clear. The authors did an excellent job. Equation 5: I am confused about a few things. The notation P(y | x; theta) is confusing because the semicolon implies that theta is a fixed vector and not a random vector; however, the conditional distribution of theta is given as P(theta | G). So what is the point of the semicolon? Also, I think there is a typo in Equation 5, because the entropy term is not defined correctly.
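For context on the notation question, here is one standard reading, offered as an assumption and not as the paper's Equation 5. If theta is treated as random with posterior P(theta | G), the prediction marginalizes over theta, and the entropy is taken over the resulting predictive distribution:

```latex
% Assumed standard Bayesian reading; not reproduced from the paper.
P(y \mid x, G) = \int P(y \mid x; \theta)\, P(\theta \mid G)\, d\theta,
\qquad
H\big[P(y \mid x, G)\big] = -\sum_{y} P(y \mid x, G)\, \log P(y \mid x, G).
```

Under this reading the semicolon only marks theta as the parameter of the likelihood inside the integral, which may be what the reviewer found inconsistent with treating theta as a random vector.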


Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data

Neural Information Processing Systems

R#2 and R#3 generally liked the paper. R#1 provided a brief review that raised a concern about the novelty of the method. The rebuttal addressed the concerns well and led all reviewers to increase their scores. We have collected comments from an additional reviewer, who pointed out more issues with the writing and the theoretical results (see below). We advise the authors to make an effort to address these issues in the revision.


Reviews: Machine Teaching of Active Sequential Learners

Neural Information Processing Systems

This paper considers the problem of teaching an active sequential *machine learner* (e.g., an active learning algorithm), with a teacher that can "fake" the labels/outcomes of training examples with the goal of steering the learner to the goal state faster. The authors refer to such a teacher as a "planning teacher", as opposed to a "naive teacher", which is often considered in classical machine teaching problems. This setting differs from conventional machine teaching settings in that, in the classical setting, the teacher can only choose among a given set of training examples that are consistent with the target concept, and is often not allowed to provide inconsistent examples. The majority of the existing work in machine teaching considers teaching a "passive learner", with a few exceptions (see additional reference in the comments below). The assumption that the teacher can choose the data-generation distribution makes it a very powerful teacher with a much richer action set than conventional teaching.


Review for NeurIPS paper: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

Neural Information Processing Systems

Weaknesses: A weakness of this work is that, from this work alone, it is not clear why the proposed changes should work well for the problem domain. Moreover, it is not clear why the interaction of the two proposed changes is so beneficial. While this is a problem throughout the body of work that effectively searches the neural-network-architecture space, it would be very beneficial to try to justify the design choices more rigorously. One way to do this is to design a toy problem showing that pre-existing work cannot handle a particular case and that, as such, the proposed changes should be accepted. As a result, it is not clear how significant this work is or will be.


Review for NeurIPS paper: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

Neural Information Processing Systems

This paper proposes an end-to-end self-supervised learning approach for speech representations. It can serve as unsupervised pre-training for fast and robust deployment of automatic speech recognition systems, especially those with low resources or limited amounts of labeled data. The authors reported compelling performance of the proposed technique on Librispeech and TIMIT. This is a strong paper, and all reviewers are supportive of acceptance. Large-scale unsupervised pre-training has had a great impact in vision and NLP, and the work reported here is the analogous effort in the speech community.