Reviews: Equality of Opportunity in Supervised Learning

Neural Information Processing Systems

It treats an incredibly important and foundational problem (fairness), proposes a creative but simple new definition, gives techniques for achieving the definition, proves theorems with regard to optimality, and even provides empirical results. As learning algorithms are used more and more broadly in situations where their decisions affect people's lives, fairness of these algorithms becomes a critical technical, social, and legal problem. While there is certainly no single "right" definition and paradigm when it comes to fairness, this definition clearly seems to be *a* right definition. It's so clean and simple that in retrospect, it seems obvious--a sign of an excellent idea. One of the many things I love about this definition and this work is how it shifts the structure of power and incentives--once a learner is constrained to be fair, under either of the definitions proposed, she is immediately incentivised to gather more data or make other efforts to do a better job of understanding protected populations.


Reviews: Improved Techniques for Training GANs

Neural Information Processing Systems

The results presented in the paper are impressive and significant enough. However, the results are quite empirical and inconclusive, and they lack theoretical justification. For the rebuttal, please focus on answering the (*), (**), and (***) mentioned in the following paragraphs. The reviewer is willing to change the score if all the questions are well addressed. Novelty: The techniques proposed in the paper are novel in general. However, the proposed "feature matching" technique for training GANs has been explored to some extent in "Generating Images with Perceptual Similarity Metrics based on Deep Networks" by Dosovitskiy and Brox and in "Autoencoding beyond pixels using a learned similarity metric" by Larsen et al.


Reviews: A Consistent Regularization Approach for Structured Prediction

Neural Information Processing Systems

In my view, this is a beautiful paper that will advance the field of structured prediction significantly and provides a platform for further development. Nevertheless, the paper should be better related to existing work on vector-valued regression for structured output. A recent related work is but there are others: Céline Brouard, Florence d'Alché-Buc, Marie Szafranski. The paper is generally well written; I have only a few remarks: - lines 70-72: you might note already here that this amounts to a ridge regression problem in the output Hilbert space.


Reviews: A Minimax Approach to Supervised Learning

Neural Information Processing Systems

The technical results appear to be correct and the experimental results (which I think are quite preliminary) suggest the minimax SVM might be a good idea. I think the idea of robust Bayes decision rules makes sense and the authors show how under squared loss a connection to the Huber loss emerges. My main comment is that the paper itself is a somewhat difficult read due to terseness at key places, which might limit the impact of the paper. So, the rest of my comments are just geared towards improving the clarity of the paper. Technically, in every instance where the authors apply Danskin's theorem, it was not really clear what form of Danskin's theorem was being used, and therefore it was difficult to follow the derivation.


Reviews: Stochastic Structured Prediction under Bandit Feedback

Neural Information Processing Systems

Summary: This paper proposes a stochastic online learning method for the task of structured prediction. In this setting, the learner does not get the correct structured output during training. Instead, it only gets bandit feedback from the labeler. The paper first proposes an online learning algorithm that learns model parameters via stochastic gradient descent; generalizes the learning method to pair-wise comparison of structured outputs; provides an optimization approach with Cross-Entropy Minimization; and theoretically analyzes the convergence property of the optimization approach. Pros: The paper proposes an online stochastic learning algorithm for minimizing the expected loss of structured predictions; gives a method of learning from pair-wise comparisons; and theoretically analyzes the convergence rate.
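As a rough illustration of the setting summarized above, the following sketch minimizes an expected loss by stochastic gradient descent when only the loss of one sampled output is observed per step. It is a generic score-function estimator under stated assumptions, not the reviewed paper's implementation: the log-linear model, the small enumerable output set, and the names `bandit_sgd`, `features`, and `loss_fn` are all hypothetical.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - scores.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def bandit_sgd(features, loss_fn, steps=2000, lr=0.5, seed=0):
    """Expected-loss minimization from bandit feedback (illustrative sketch).

    features: (K, d) array with one feature vector per candidate output
              (assumption: the output set is small enough to enumerate).
    loss_fn:  maps the index of the sampled output to a loss in [0, 1];
              this single number is the only feedback the learner sees.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        p = softmax(features @ w)          # log-linear model over outputs
        y = rng.choice(len(p), p=p)        # sample one output to present
        loss = loss_fn(y)                  # bandit feedback for that output only
        # Gradient of log p(y) for a log-linear model:
        # feature vector of y minus the expected feature vector under p.
        grad_log_p = features[y] - p @ features
        w -= lr * loss * grad_log_p        # stochastic estimate of the expected-loss gradient
    return w


# Toy usage: three candidate outputs, only output 2 has zero loss.
toy_features = np.eye(3)
toy_w = bandit_sgd(toy_features, lambda y: 0.0 if y == 2 else 1.0)
toy_probs = softmax(toy_features @ toy_w)
```

In the toy usage, the learner never sees which output is correct; it only observes a loss for the output it sampled, and probability mass still shifts toward the zero-loss output.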


Reviews: Supervised learning through the lens of compression

Neural Information Processing Systems

Most of the results established in the paper would, in the special case of binary classification, trivially follow from the known upper and lower bounds on sample complexity based on the VC dimension. However, the results were not previously known for multiclass learning, and other general loss functions. The results for the 0-1 loss are not particularly surprising, but it is good to know that, for instance, in multiclass classification with the 0-1 loss, the complexity measure in the agnostic sample complexity is the same as that in the realizable-case (up to log factors, but no extra factors such as log(|Y|) not present in the realizable-case sample complexity). They also prove a tighter lower bound than previously known for the sample complexity of uniform convergence for multiclass classification in Theorem 3.6. The techniques used in the proofs are mostly straightforward or have appeared in other related contexts previously.


Reviews: More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning

Neural Information Processing Systems

This paper is interesting and deals with a new kind of result that introduces computational aspects into standard minimax theory. The phenomenon illustrated is new to me, and it exposes some limitations of computationally tractable algorithms w.r.t. the "theoretical" ones that could be considered in classical minimax theory. However, due to the relative novelty of the framework, it would be important that basic definitions and properties be better presented. In the following there is only one model investigated.


Reviews: Structured Prediction Theory Based on Factor Graph Complexity

Neural Information Processing Systems

The paper is well written and motivated. In particular the problem considered is relevant. On the downside there are some issues related to the interpretability of the presented results: - In Theorem 1 the generalization error is bounded in terms of the additive or multiplicative empirical margin losses. However, their formulation in Eqs. (5) and (6) is hard to interpret and would benefit from a comment. This is problematic since it is not clear how these quantities are related to the algorithmic approaches discussed in Sec. 5.


Reviews: Supervised Learning with Tensor Networks

Neural Information Processing Systems

The idea of using tensor networks for machine learning problems is interesting. Unfortunately, there are many important issues that prevent the publication of the manuscript as it is now. My main concerns are the following: 1- Some parts of the paper are hard to read. The paper applies and adapts some ideas about tensor networks that are used/developed by the Physics community to ML problems. The authors have done a good job in illustrating the idea and the intuition by many figures.


Reviews: Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning

Neural Information Processing Systems

This work proposes to use semi-supervised learning, in the form of an unsupervised loss term, for improving the regularization capacity of CNNs. The idea (and the proposed loss) is conceptually simple and enforces stability explicitly by minimizing the difference between predictions corresponding to the same input data point. The paper focuses mainly on the experimental side, devoting the largest part to presenting results when adding the new loss to standard supervised CNNs. This is the stronger aspect of this work, with the weaker being the lack (or the definition) of baselines and the lack of some form of theoretical justification, derivation or discussion. Novelty/originality: The main contribution is the application of the unsupervised loss term for controlling the stability of the predictions under transformations or stochastic variability.
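To make the stability idea described above concrete, here is a minimal sketch of an unsupervised loss that penalizes disagreement between predictions obtained for the same input under different random transformations or perturbations. The squared Euclidean distance summed over all pairs of passes is an assumption chosen for illustration, and the function name `stability_loss` is hypothetical; the excerpt does not give the paper's exact formulation.

```python
import numpy as np


def stability_loss(predictions):
    """Sum of squared differences between all pairs of predictions.

    predictions: (n_passes, n_classes) array-like holding the network
    outputs for ONE input, each row produced under a different random
    transformation or stochastic perturbation. No label is needed.
    """
    preds = np.asarray(predictions, dtype=float)
    n_passes = preds.shape[0]
    total = 0.0
    for j in range(n_passes):
        for k in range(j + 1, n_passes):
            # Penalize disagreement between pass j and pass k.
            total += float(np.sum((preds[j] - preds[k]) ** 2))
    return total


# Identical predictions incur no penalty; conflicting ones do.
consistent = stability_loss([[0.5, 0.5], [0.5, 0.5]])
conflicting = stability_loss([[1.0, 0.0], [0.0, 1.0]])
```

Because the loss uses no labels, it can be computed on unlabeled data and added to a standard supervised loss, which is how the excerpt describes the term being used.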