Neural Information Processing Systems 

The dual problem is split into blocks of variables, and each block is solved independently with an arbitrary optimizer. At the end of each iteration, the (weighted) mean of the partial solutions is computed and shared among the nodes. The rate of convergence is analysed, and the algorithm is shown to converge at a linear rate whenever the optimizer used on the blocks does. Good results are shown on a set of problems.

Quality: The paper is overall of high quality. The proofs appear to be correct and the experiments are reasonably well executed. However, I would have liked to see a few more results. On the one hand, the way the data points are distributed among the nodes has an effect; on the other hand, there is a strong dependency on the strength of regularization. With small regularization, a large H will lead to greater conflict between the partial solutions, so that convergence might be very slow and a lot of computation is wasted.
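To make the scheme summarized in this review concrete, here is a minimal sketch of block-wise dual ascent with averaging. It is not the authors' implementation, and several choices are assumptions made for illustration only: the problem is ridge regression, the local optimizer is a closed-form stochastic dual coordinate ascent step, the mean is unweighted, and the nodes are simulated sequentially in one process. All function and parameter names are hypothetical; `H` is taken here to mean the number of local steps per round.

```python
import numpy as np


def dual_objective(alpha, X, y, lam):
    """Dual of ridge regression: min_w lam/2 ||w||^2 + 1/n sum 0.5 (x_i.w - y_i)^2."""
    n = X.shape[0]
    w = X.T @ alpha / (lam * n)
    return np.mean(alpha * y - 0.5 * alpha ** 2) - 0.5 * lam * (w @ w)


def local_sdca(X_k, y_k, alpha_k, w, lam, n, H, rng):
    """Run H coordinate-ascent steps on one block, starting from the shared w.

    Assumption: squared loss, so each coordinate step has a closed form.
    Returns the proposed change in the block's dual variables and in w.
    """
    d_alpha = np.zeros_like(alpha_k)
    w_loc = w.copy()  # local copy; the other blocks are not visible here
    for _ in range(H):
        i = rng.integers(len(y_k))
        x = X_k[i]
        a = alpha_k[i] + d_alpha[i]
        delta = (y_k[i] - x @ w_loc - a) / (1.0 + (x @ x) / (lam * n))
        d_alpha[i] += delta
        w_loc += delta * x / (lam * n)
    return d_alpha, w_loc - w


def averaged_block_dual_ascent(X, y, lam, K, H, rounds, seed=0):
    """Split the dual variables into K blocks, solve locally, average, repeat."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    blocks = np.array_split(rng.permutation(n), K)  # random data partition
    alpha, w = np.zeros(n), np.zeros(d)
    history = [dual_objective(alpha, X, y, lam)]
    for _ in range(rounds):
        # Every block starts from the same (alpha, w); on a cluster these
        # K calls would run in parallel, here they run one after another.
        updates = [local_sdca(X[b], y[b], alpha[b], w, lam, n, H, rng)
                   for b in blocks]
        # Averaging step: apply the unweighted mean of the partial updates.
        for b, (d_alpha, _) in zip(blocks, updates):
            alpha[b] += d_alpha / K
        w += sum(d_w for _, d_w in updates) / K
        history.append(dual_objective(alpha, X, y, lam))
    return w, alpha, history
```

In this sketch, the two effects raised under Quality can be probed by replacing the random permutation with a non-random partition of the rows, and by lowering `lam` while raising `H`, then comparing how quickly `history` rises per round.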