Review for NeurIPS paper: Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Neural Information Processing Systems

Additional Feedback: The main contribution of this paper is a single-loop stochastic algorithm that achieves the best-known complexity bound and outperforms the existing prox-linear algorithms in a computational sense. To make this argument more convincing, I hope the authors can address the following concerns: 1. Clearly state why your algorithm outperforms prox-linear algorithms. Indeed, by simply exploiting the structure of the stochastic problem in Eq.(1), the prox-linear subproblem can be reformulated using the conjugate function and becomes the same as your subproblem. When b and \psi are in special forms, I agree that we can obtain a closed-form solution, as you point out. However, this also holds true for prox-linear algorithms and a few other double-loop algorithms.


Review for NeurIPS paper: Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Neural Information Processing Systems

The paper introduces a single-loop stochastic algorithm for solving a special class of nonconvex-concave minimax problems that achieves the best-known complexity bound. The rebuttal addressed most of the reviewers' concerns about the algorithmic justification, although some concern remains regarding the special structure. However, please consider revising the paper to address R1's and R3's remarks, in particular: - Adjust the title to reflect the special structure instead of overclaiming the contribution; - Elaborate on the desirable properties of the single-loop algorithm over existing methods; - Add detailed comparisons to prior work, including prox-linear algorithms for compositional problems and recent algorithms for general nonconvex-concave minimax problems.



Review for NeurIPS paper: Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Neural Information Processing Systems

The reviewers generally liked this paper and also provided a number of suggestions for improvement. Please take these recommendations seriously when revising the paper. In particular, I agree with Reviewer 4 that the informal theorem statements in the main body obscure many details. Theorem 2, in particular, seems to be too formal (do we need all these exact numeric constants?) while at the same time obscuring important details. Overall, the ideas are interesting, but I found the paper somewhat messy to read.


Letters, Colors, and Words: Constructing the Ideal Building Blocks Set

arXiv.org Artificial Intelligence

Define a building blocks set to be a collection of n cubes (each with six sides) where each side is assigned one letter and one color from a palette of m colors. We propose a novel problem of assigning letters and colors to each face so as to maximize the number of words from a chosen dataset that one can spell as either mono words (all letters have the same color) or rainbow words (all letters have unique colors). We consider a chosen set of English words, up to six letters long, from the typical vocabulary of a US American 14-year-old, and study the case n = 6 and m = 6 with the added restriction that each color appears exactly once on each cube. The problem is intractable, as the size of the solution space makes a brute-force approach computationally infeasible. To address this, we explore a range of optimization techniques: random search, simulated annealing, two distinct tree search methods (greedy and best-first), and a genetic algorithm. Additionally, we attempted to implement a reinforcement learning approach; however, the model failed to converge to viable solutions within the problem's constraints. Among these methods, the genetic algorithm delivered the best performance, achieving a total of 2846 mono and rainbow words.
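The objective described above can be made concrete with a small sketch. The abstract does not say how the authors score a candidate block set, so everything below is an assumption made for illustration: the names `can_spell` and `score` are hypothetical, a face is modeled as a `(letter, color)` pair, each letter of a word must come from a different cube, and a word is counted once if it can be spelled as either a mono or a rainbow word.

```python
from typing import FrozenSet, List, Sequence, Tuple

Face = Tuple[str, int]   # (letter, color index); hypothetical representation
Cube = Sequence[Face]    # six faces per cube in the n = 6, m = 6 setting


def can_spell(word: str, cubes: Sequence[Cube], mode: str) -> bool:
    """Return True if `word` can be spelled using one face from each of
    len(word) distinct cubes.

    mode="mono":    all chosen faces share a single color.
    mode="rainbow": all chosen faces have pairwise distinct colors.
    """
    if mode not in ("mono", "rainbow"):
        raise ValueError(f"unknown mode: {mode}")
    if len(word) > len(cubes):
        return False

    def search(pos: int, used: FrozenSet[int], colors: FrozenSet[int]) -> bool:
        # Backtracking over (cube, face) choices for the letter at `pos`.
        if pos == len(word):
            return True
        for i, cube in enumerate(cubes):
            if i in used:
                continue
            for letter, color in cube:
                if letter != word[pos]:
                    continue
                if mode == "mono" and colors and color not in colors:
                    continue  # would introduce a second color
                if mode == "rainbow" and color in colors:
                    continue  # would repeat a color
                if search(pos + 1, used | {i}, colors | {color}):
                    return True
        return False

    return search(0, frozenset(), frozenset())


def score(words: Sequence[str], cubes: Sequence[Cube]) -> int:
    """Count words spellable as a mono word or as a rainbow word."""
    return sum(
        1
        for w in words
        if can_spell(w, cubes, "mono") or can_spell(w, cubes, "rainbow")
    )
```

Under these assumptions, a search method such as simulated annealing or a genetic algorithm would call `score` on each candidate assignment of letters and colors. The exhaustive backtracking is only practical because words are at most six letters long.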


Review for NeurIPS paper: Deep Subspace Clustering with Data Augmentation

Neural Information Processing Systems

The policy found via the proposed greedy search strategy outperforms policies found in the fully supervised setting of ImageNet classification (by AutoAugment and practitioners). However, it is hard to tell whether a different search method would result in a better policy. It would be good to include baselines for the search method. It would be good to discuss this. It would also be good to discuss related work on searching for data augmentation policies (e.g. It would be nice to also show results on using the learnt features for downstream tasks (e.g.


Reviews: Learning Compositional Neural Programs with Recursive Tree Search and Planning

Neural Information Processing Systems

It instead learns the hierarchy of program subroutines in a curriculum fashion, adding a pre- and post-condition to each subroutine and extending the MCTS setup of AlphaZero to handle recursive subroutine calls. The paper demonstrates that the resulting formulation learns the programs in both Sorting and TowersOfHanoi domains more effectively than prior work.


Reviews: Learning Compositional Neural Programs with Recursive Tree Search and Planning

Neural Information Processing Systems

The authors should be commended for an excellent submission to NeurIPS. The concerns about clarity that the reviewers raised seem to be addressable, as the authors describe in their rebuttal. The topic, "unsupervised" (really, less-supervised) structured neural program induction, is perfect for NeurIPS, and the empirical results on sorting and other tasks, as compared to the original neural programmer interpreter, are exciting.


Review for NeurIPS paper: Interstellar: Searching Recurrent Architecture for Knowledge Graph Embedding

Neural Information Processing Systems

The motivation for defining "path interstellar" is strong and clearly stated. By comparing the learning ability of triplet-based, path-based, and GCN-based methods, the path interstellar (Definition 1) is proposed as the basic model to learn from KGs. This motivation has also been verified by a case study on synthetic data (experiments in section 4.2). - Domain-specific and well-defined search space. The authors propose a novel recurrent search space specific for the path learning problem. The searched components are either motivated by the models in the literature (combinators, activations) or by the learning problem (connections).


Review for NeurIPS paper: Minimax Bounds for Generalized Linear Models

Neural Information Processing Systems

The novelty of this paper seems questionable, mainly in view of [23]. Specifically, [23] studied a similar problem for generalized linear models, where the only difference seems to be that the estimation error was considered instead of the prediction error. The technical steps are also very close to each other: both works reduce the problem to a Bayesian entropic loss, then invoke the result of [24] to show that an upper bound on the Fisher information is sufficient, and finally provide upper bounds on the Fisher information. Of course the last step is different; however, this difference does not seem to add much novelty. First, some problems that the previous approaches suffered from can be easily fixed.