Supervised Learning
Review for NeurIPS paper: Provably adaptive reinforcement learning in metric spaces
This paper is about model-free RL where the state-action space is a metric space. An improved analysis of an existing algorithm (with some modifications) is shown to achieve a regret that scales with the zooming dimension of the metric space, instead of the covering dimension. A general consensus among reviewers emerged that this theoretical RL paper is well executed, and provides a reasonable though not groundbreaking contribution to the RL literature.
Reviews: Structured Prediction with Projection Oracles
Post-feedback update: Thanks for your update. Your additional explanations and results will help improve the paper, and I definitely think this work is strong and should be accepted. The framework itself is new, and the authors make it very clear how prior work fits into the framework as special cases. At the same time, a good case is made for why this framework is useful to have and how it can be better to use than prior losses. Quality: this paper makes a compelling case for the framework it introduces.
Reviews: Structured Prediction with Projection Oracles
All reviewers agreed that this paper makes a nice contribution to NeurIPS by providing a novel general framework for generating calibrated surrogate loss functions for structured prediction problems. On the other hand, in discussion, they also stressed that including some baselines (e.g., SSVM/CRF approximation/SPEN) in the experiments and reporting runtimes could make this paper much stronger. The authors should implement their promised changes in the camera-ready version.
Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
Except that the presence of each edge is probabilistic rather than deterministic, the core idea is quite similar to Isomap. The novelty should be better addressed by comparing to Isomap. For example, edges between words that frequently co-occur in the same contexts are not independent of each other. Edges between pixels in small coherent regions are not independent. Do we eventually need to know such dependency structures a priori to correctly represent arbitrary geometry in the data?
Reviews: Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
The paper proposes quite an interesting idea of representing data by weighted graphs (shortest paths between nodes). Reviewers have raised concerns about edge dependency and the given similarity metric. However, I'm less worried about the independence assumption because, after all, it's a model, and it seems to work well in experiments. Likewise, it is also common in variational inference to use an independent distribution to approximate a graphical model, based on which learning is carried out. What interests me more is the general methodology of optimization.
Reviews: Deep Structured Prediction for Facial Landmark Detection
The integration of convnets with conditional random fields to model the structural dependencies of facial landmarks during face alignment is a nice contribution. Previously proposed methods in this direction were hybrid systems (e.g., OpenFace versions) and not fully integrated. The authors evaluate on multiple datasets (300W, 300W-Video, Menpo & COFW-68) and compare results with other methods. Both inter- and cross-dataset performance are provided.
Reviews: Exact inference in structured prediction
Overview: - This paper studies the conditions for exact recovery of ground-truth labels in structured prediction under some data generation assumptions. In particular, the analysis generalizes the one in Globerson et al. (2015) from grid graphs to general connected graphs, providing high-probability guarantees for exact label recovery which depend on structural properties of the graph. On the other hand, the assumed generative process (lines 89-101, proposed in Globerson et al., 2015) is somewhat toyish, which might make the results less interesting. Therefore, I am inclined towards acceptance, but not strongly. Comments: - I feel that the presentation can be greatly improved by including an overview of the main result at the beginning of Section 3. In particular, you can state the main result, which is actually given in Remark 2 (!), and then provide some high-level intuition on the path to proving it.
Reviews: Exact inference in structured prediction
The paper gives a theoretical analysis of Markov random fields. The authors answer the question of when exact inference can be done in polynomial time. This is a generalization of a result in Globerson et al. (2015) from grid graphs to general connected graphs, which is, in my opinion, a non-trivial generalization. The paper is self-contained and readable for the machine learning community, although quite technical. Indeed, I consider it a theoretical paper that has all the qualities needed for NeurIPS acceptance.
Review for NeurIPS paper: Structured Prediction for Conditional Meta-Learning
In particular, more task-conditioning methods (e.g., MMAML) are considered in this paper. However, my major concern has not been addressed. The authors still omit any discussion of multi-task learning. From my perspective, the goal of meta-learning is to generalize knowledge from previous tasks, which further benefits the training of a new task. The setting in this paper allows a new meta-testing task to access all meta-training tasks.