Overview
Export Reviews, Discussions, Author Feedback and Meta-Reviews
First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The paper is concerned with Monte Carlo sampling based on the discretisation of SDEs. This is a topical subject: such techniques have attracted interest lately because they allow the use of stochastic gradients, which are appealing in some big data settings since the algorithms can run with only partial evaluation of the likelihood/energy function. The paper is particularly well written and pedagogical. In addition, it clarifies earlier contributions and provides a rigorous overview of the main results useful in this emerging area.
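To make the idea of a discretised SDE driven by stochastic gradients concrete, here is a minimal sketch of one stochastic gradient Langevin step. It is not the reviewed paper's algorithm: the toy model (a standard normal prior on a scalar mean and unit-variance Gaussian observations), the function name `sgld_step`, and all parameter names are assumptions chosen for illustration.

```python
import math
import random
from typing import Sequence


def sgld_step(theta: float, data: Sequence[float], batch_size: int,
              step: float, rng: random.Random) -> float:
    """One Euler-type Langevin step using a minibatch gradient estimate.

    Assumed toy model (hypothetical, not from the review):
    prior theta ~ N(0, 1), observations x_i ~ N(theta, 1).
    """
    # Only a random subset of the data is touched: partial likelihood evaluation.
    batch = rng.sample(list(data), batch_size)
    # Gradient of the log prior plus a rescaled minibatch log-likelihood gradient.
    scale = len(data) / batch_size
    grad = -theta + scale * sum(x - theta for x in batch)
    # Drift of step/2 times the gradient, plus Gaussian noise with variance `step`.
    return theta + 0.5 * step * grad + math.sqrt(step) * rng.gauss(0.0, 1.0)


def run_chain(data: Sequence[float], n_steps: int, batch_size: int,
              step: float, seed: int = 0) -> float:
    """Run the sketch from theta = 0 and return the final state."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(n_steps):
        theta = sgld_step(theta, data, batch_size, step, rng)
    return theta
```

With a small step size the chain drifts toward the region of high posterior mass while each step touches only `batch_size` observations.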
"NIPS Neural Information Processing Systems 8-11th December 2014, Montreal, Canada",,, "Paper ID:","406" "Title:","Learning Mixed Multinomial Logit Model from Ordinal Data" Current Reviews First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Summary: This paper extends the classic MultiNomial Logit (MNL) choice model to a general family of choice models named Mixed MNL, which can be seen as a parametric class of distributions over permutations (e.g., permutations of items according to user preference). The main contributions of the paper are (1) to identify sufficient conditions under which a mixed MNL can be learnt, and (2) to propose a two-phase algorithm to learn the proposed mixed MNL models in an efficient manner. Part of the interesting theoretical results shows that the model with r components can be learnt with sample size being polynomially in n (number of items of interest) and r (number of components). Quality: The problem choice modeling studied in this paper is a fundamental and critical problem to the social choice community, and the proposed model and algorithm for this problem are certainly of interest to the machine learning community.
This paper provides two algorithms based on the soft-thresholding method for estimating a penalized pseudo-likelihood graphical model. The coordinate-wise method seems a practical improvement on the current CONCORD method. Overall, this is a worthwhile addition to a booming literature on this issue. The method has computational complexity O(sp^2), but clearly also depends on the starting value.
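For readers unfamiliar with the building block mentioned above, this is a minimal sketch of the scalar soft-thresholding operator. It shows only the operator itself, not either of the paper's two algorithms; the name `soft_threshold` and the threshold parameter `lam` are assumptions.

```python
def soft_threshold(x: float, lam: float) -> float:
    """Return sign(x) * max(|x| - lam, 0) for a threshold lam >= 0."""
    if x > lam:
        return x - lam   # shrink positive values toward zero
    if x < -lam:
        return x + lam   # shrink negative values toward zero
    return 0.0           # values within [-lam, lam] are set exactly to zero
```

Applying this operator entry by entry is what produces exact zeros, and hence sparsity, in penalized estimates of this kind.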
This paper proposes a probabilistic approach for learning the assignment of exercises to skills from student data, where student knowledge changes while exercises are being solved; the model also estimates the student knowledge while estimating the skill assignments. The paper uses a weighted CRP to model the assignment, incorporating expert labelings through the weighting. In simulation, the method recovers skill labelings with high accuracy, with little dependence on the expert labels, and across several datasets, the paper finds that skill labelings from this method result in higher prediction accuracy than other approaches. Overall, I found the paper clear, and the proposed model is a relatively novel extension of existing methods.
This paper proposes a new regularization method for structured prediction. The idea is relatively straightforward: a linear chain model is segmented into smaller subchains, each of which is added as an independent training example. Theorems are provided (with proofs in the supplement) showing how this regularization can reduce generalization risk and accelerate convergence rates. Empirical comparisons with state-of-the-art approaches suggest that the resulting method is both faster and more accurate.
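The segmentation idea can be sketched as follows. The review does not say how segment boundaries are chosen, so this example assumes contiguous segments of a fixed length; the function name `split_chain` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def split_chain(xs: Sequence[X], ys: Sequence[Y],
                segment_len: int) -> List[Tuple[List[X], List[Y]]]:
    """Cut one labeled chain into contiguous subchains of at most segment_len.

    Each returned (inputs, labels) pair can be treated as an independent
    training example. Fixed-length cutting is an assumption of this sketch.
    """
    if len(xs) != len(ys):
        raise ValueError("inputs and labels must have the same length")
    if segment_len < 1:
        raise ValueError("segment_len must be at least 1")
    return [
        (list(xs[i:i + segment_len]), list(ys[i:i + segment_len]))
        for i in range(0, len(xs), segment_len)
    ]
```

For example, a five-token chain with `segment_len=2` yields three training examples of lengths 2, 2 and 1.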
Responses to Review #
We thank all the reviewers for the time and expertise invested in these reviews. Q: What is the meaning of every notation? Their corresponding lowercase letters refer to one instance in the set, e.g. Q: What is the relationship to other Transfer Learning/Imitation Learning methods? Since no major flaws are pointed out in the review, could the reviewer please raise the overall score?
The paper presents a new recursive neural network architecture for semantic scene labeling and shows that it outperforms previous approaches on two standard datasets in terms of pixel accuracy. The paper is generally very well written and the proposed model seems quite natural and conceptually clean compared to its main competitors. My only major concern is that the paper doesn't separately evaluate the effects of the combiner and decombiner networks. An even simpler model could use the combiner network to recursively collapse everything to a single root node (as is done already) but then directly feed the output of the F_sem network along with the root node F_com features into each corresponding F_lab network.
Summary: This paper proposes boosting algorithms and analyzes them from an online learning perspective. First, the authors propose a boosting algorithm based on the online mirror descent update (MABoost). Then they present a smooth version of MABoost, i.e., a variant that creates only smooth distributions over examples. Further, they propose sparse and lazy versions of MABoost and give convergence proofs for them.
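To illustrate what a mirror descent update on a distribution over examples looks like, here is a minimal sketch using the entropic (multiplicative-weights) form. It is one well-known instance of mirror descent on the simplex, not necessarily the exact MABoost update; the function name, the margin convention (positive when the weak hypothesis is correct on an example), and the step size `eta` are assumptions.

```python
import math
from typing import List, Sequence


def entropic_md_update(weights: Sequence[float], margins: Sequence[float],
                       eta: float) -> List[float]:
    """Multiplicative update of example weights followed by renormalisation.

    margins[i] is assumed to be positive when the current weak hypothesis
    is correct on example i, so correctly classified examples lose weight.
    """
    if len(weights) != len(margins):
        raise ValueError("weights and margins must have the same length")
    unnormalised = [w * math.exp(-eta * m) for w, m in zip(weights, margins)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]
```

After one update, misclassified examples carry more of the probability mass, which is the reweighting behaviour boosting relies on.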