
e96ed478dab8595a7dbda4cbcbee168f-Reviews.html

Neural Information Processing Systems

This paper proposes a simple latent factor model for one-shot learning with continuous outputs, where very few observations are available. Specifically, it derives risk approximations in an asymptotic regime where the number of training examples is fixed and the number of features in the X space diverges. Based on the principal component regression (PCR) estimator, two estimators are proposed, a bias-corrected estimator and a so-called oracle estimator, and bounds on their risks are derived. These bounds provide insight into the significance of various parameters relevant to one-shot learning.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada. Paper ID: 1233. Title: A Multiplicative Model for Learning Distributed Text-Based Attribute Representations.

This paper proposes to incorporate side information to improve vector-space embeddings of words via an attribute vector that modulates the word-projection matrices. One can think of this as a word-projection tensor (in practice the tensor is factorized) in which the attribute vector provides the loadings for the tensor slices. This is studied in the context of log-bilinear language models, but the basic idea should be applicable to other word-embedding work. The theory part of the paper is very well written. However, things get somewhat muddier in the experimental section.
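The "loadings for the tensor slices" description above can be illustrated with a small sketch. It shows only the unfactorized view, in which the attribute vector weights a stack of projection matrices; the function name, the shapes, and the toy numbers are assumptions made for illustration and are not taken from the paper, which (per the review) uses a factorized tensor in practice.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def attribute_gated_matrix(tensor: Sequence[Matrix],
                           attribute: Sequence[float]) -> Matrix:
    """Return sum_i attribute[i] * tensor[i].

    Hypothetical helper: each tensor[i] is one slice (a projection matrix)
    and attribute[i] is its loading. This is the plain, unfactorized form.
    """
    if len(tensor) != len(attribute):
        raise ValueError("need exactly one loading per tensor slice")
    rows, cols = len(tensor[0]), len(tensor[0][0])
    gated = [[0.0] * cols for _ in range(rows)]
    for weight, matrix in zip(attribute, tensor):
        for r in range(rows):
            for c in range(cols):
                gated[r][c] += weight * matrix[r][c]
    return gated


# Toy example: two 2x2 slices mixed by a 2-dimensional attribute vector.
slices = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 2.0], [2.0, 0.0]],
]
mixed = attribute_gated_matrix(slices, [0.5, 1.0])
```

Changing the attribute vector changes the effective projection matrix, which is the sense in which the attributes modulate the word representations.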



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

This paper proposes a new regularization method for structured prediction. The idea is relatively straightforward: a linear chain model is segmented into smaller subchains, each of which is added as an independent training example. Theorems are provided (with proofs in the supplement) showing how this regularization can reduce generalization risk and accelerate convergence rates. Empirical comparisons with state-of-the-art approaches suggest that the resulting method is both faster and more accurate.
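As a rough sketch of the segmentation idea described in this review, the snippet below cuts one labeled chain into shorter subchains that can each be treated as a separate training example. The fixed-length, non-overlapping split and the names `split_chain` and `max_len` are assumptions for illustration; the review does not say how the paper chooses segment boundaries.

```python
from typing import List, Sequence, Tuple


def split_chain(tokens: Sequence[str],
                labels: Sequence[int],
                max_len: int) -> List[Tuple[List[str], List[int]]]:
    """Cut one labeled chain into consecutive subchains of at most max_len.

    Hypothetical helper: a fixed-length, non-overlapping split is assumed.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [
        (list(tokens[i:i + max_len]), list(labels[i:i + max_len]))
        for i in range(0, len(tokens), max_len)
    ]


# One chain of five positions becomes three independent training examples.
subchains = split_chain(["a", "b", "c", "d", "e"], [0, 1, 0, 1, 1], max_len=2)
```

Dependencies between positions that fall in different subchains are dropped by construction, which is where the regularizing effect discussed in the review would come from.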




Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

NIPS Neural Information Processing Systems, 8-11th December 2014, Montreal, Canada. Paper ID: 157. Title: Object Localization based on Structural SVM using Privileged Information.

The method is effective for the object localization task and yields good improvements in localization accuracy. It appears that the authors' formulation of SSVM+ contains a separate slack variable \xi_i for each example x_i and that there are extra degrees of freedom. How many alternating iterations are required? When the parameter vectors w and w^* are far from the optimal solution, could this alternating inference procedure get stuck in bad local minima?




Predtron: A Family of Online Algorithms for General Prediction Problems

Neural Information Processing Systems

Modern prediction problems arising in multilabel learning and learning to rank pose unique challenges to the classical theory of supervised learning. These problems have large prediction and label spaces of a combinatorial nature and involve sophisticated loss functions. We offer a general framework to derive mistake-driven online algorithms and associated loss bounds. The key ingredients in our framework are a general loss function, a general vector space representation of predictions, and a notion of margin with respect to a general norm. Our general algorithm, Predtron, yields the perceptron algorithm and its variants when instantiated on classic problems such as binary classification, multiclass classification, ordinal regression, and multilabel classification. For multilabel ranking and subset ranking, we derive novel algorithms, notions of margins, and loss bounds. A simulation study confirms the behavior predicted by our bounds and demonstrates the flexibility of the design choices in our framework.
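For readers unfamiliar with the special case the abstract mentions, here is a minimal sketch of the classic mistake-driven perceptron for binary classification. It is the textbook algorithm, not the Predtron framework itself; the function names and the toy data are made up for illustration.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label in {-1, +1})


def perceptron_train(examples: Sequence[Example],
                     epochs: int = 10) -> Tuple[List[float], int]:
    """Classic perceptron: update the weights only when a mistake is made."""
    weights = [0.0] * len(examples[0][0])
    mistakes = 0
    for _ in range(epochs):
        clean_pass = True
        for x, y in examples:
            score = sum(w * xi for w, xi in zip(weights, x))
            if y * score <= 0:  # wrong side of (or exactly on) the boundary
                weights = [w + y * xi for w, xi in zip(weights, x)]
                mistakes += 1
                clean_pass = False
        if clean_pass:  # stop once a full pass makes no mistakes
            break
    return weights, mistakes


def perceptron_predict(weights: Sequence[float], x: Sequence[float]) -> int:
    """Predict +1 if the linear score is positive, else -1."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else -1


# Toy linearly separable data (illustrative only).
data = [((1.0, 1.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.0), -1), ((-0.5, -2.0), -1)]
weights, mistakes = perceptron_train(data)
```

The number of updates equals the number of mistakes, which is the quantity that mistake bounds for such algorithms control.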