
Laplacian Regularizer


Reviews: Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces

Neural Information Processing Systems

The prior literature attributes the poor performance of embedding-based methods to their low-dimensional label embeddings. In this paper, the authors point out that the final score vector for the labels is actually produced by a highly non-linear transformation, such as thresholding the scores, so it is not clear that low-rank structure in the score vectors directly causes low rank in the label vectors. Furthermore, the authors show that a simple neural network mimicking the low-dimensional embedding can attain near-perfect training accuracy yet generalize poorly, suggesting that overfitting is the root cause of the poor performance of embedding-based methods. This is the first contribution of the paper, which breaks the glass ceiling of embedding-based methods.
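The distinction the review draws can be made concrete with a small sketch. This is a hypothetical illustration (the dimensions, variable names, and the random linear factors are not from the paper): a low-rank factorized scorer produces score vectors of rank at most the embedding dimension, but the thresholded label vectors it induces need not be low-rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_labels, embed_dim = 50, 1000, 8

# Low-rank (embedding-based) scorer: scores = X @ W @ V,
# where W maps features to an embedding and V maps the embedding to labels.
X = rng.standard_normal((4, n_features))          # a mini-batch of inputs
W = rng.standard_normal((n_features, embed_dim))  # feature -> embedding
V = rng.standard_normal((embed_dim, n_labels))    # embedding -> label scores

scores = X @ W @ V                   # rank of this matrix is at most embed_dim
predicted = (scores > 0).astype(int) # thresholding: a highly non-linear step

# The score matrix is low-rank by construction, but after thresholding
# there is no such guarantee for the binary label matrix.
assert np.linalg.matrix_rank(scores) <= embed_dim
```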


On Consistency of Graph-based Semi-supervised Learning

Du, Chengan, Zhao, Yunpeng

arXiv.org Machine Learning

Graph-based semi-supervised learning is one of the most popular methods in machine learning. Some of its theoretical properties, such as bounds on the generalization error and the convergence of the graph Laplacian regularizer, have been studied in the computer science and statistics literature. However, a fundamental statistical property, the consistency of the estimator produced by this method, has not been proved. In this article, we study the consistency problem under a non-parametric framework. We prove the consistency of graph-based learning in the case where the estimated scores are constrained to equal the observed responses on the labeled data. The sample sizes of both labeled and unlabeled data are allowed to grow in this result. When the estimated scores are not required to equal the observed responses, a tuning parameter is used to balance the loss function and the graph Laplacian regularizer. We give a counterexample demonstrating that the estimator in this case can be inconsistent. The theoretical findings are supported by numerical studies.
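The two estimators the abstract contrasts can both be written as small linear solves. The sketch below is a minimal toy example, not the paper's setup: a path graph of six nodes, two of them labeled. The hard-constraint version interpolates the labels harmonically; the soft version trades data fit against the Laplacian penalty via a tuning parameter `lam` (an illustrative name and value).

```python
import numpy as np

# Toy graph: 6 nodes on a path; the first two are labeled.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
D = np.diag(W.sum(axis=1))
L = D - W                                 # unnormalized graph Laplacian

labeled = np.array([0, 1])
unlabeled = np.array([2, 3, 4, 5])
y = np.array([1.0, 0.0])                  # observed responses on labeled nodes

# Hard-constraint estimator: minimize f^T L f subject to f = y on labeled nodes.
# Solving for the unlabeled block gives L_uu f_u = -L_ul y.
Luu = L[np.ix_(unlabeled, unlabeled)]
Lul = L[np.ix_(unlabeled, labeled)]
f_u = np.linalg.solve(Luu, -Lul @ y)

f = np.zeros(n)
f[labeled] = y
f[unlabeled] = f_u

# Soft estimator: minimize ||f - y||^2 on labeled nodes + lam * f^T L f.
y_full = np.zeros(n)
y_full[labeled] = y
J = np.zeros((n, n))
J[labeled, labeled] = 1.0                 # indicator of labeled nodes
lam = 0.1                                 # tuning parameter balancing the two terms
f_soft = np.linalg.solve(J + lam * L, y_full)
```

On this path graph the harmonic solution is constant past the second labeled node, so the hard-constraint estimate on the unlabeled nodes is identically zero.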


Semi-supervised Regression via Parallel Field Regularization

Lin, Binbin, Zhang, Chiyuan, He, Xiaofei

Neural Information Processing Systems

This paper studies the problem of semi-supervised learning from the vector field perspective. Much of the existing work uses the graph Laplacian to ensure the smoothness of the prediction function on the data manifold. However, beyond smoothness, recent theoretical work suggests that we should ensure second-order smoothness to achieve faster rates of convergence for semi-supervised regression problems. To achieve this goal, we show that second-order smoothness measures the linearity of the function, and that the gradient field of a linear function must be a parallel vector field. Consequently, we propose to find a function that minimizes the empirical error while simultaneously requiring its gradient field to be as parallel as possible. We give a continuous objective function on the manifold and discuss how to discretize it using random points. The discretized optimization problem turns out to be a sparse linear system that can be solved very efficiently. The experimental results demonstrate the effectiveness of our proposed approach.
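The abstract's final point, that the discretized problem reduces to a sparse linear system, can be illustrated with a simplified analogue. This sketch does not reproduce the paper's parallel-field discretization; instead it uses a bi-Laplacian penalty `L.T @ L` as a stand-in for a discrete second-order smoothness term (all names, dimensions, and the value of `lam` are illustrative), and solves the resulting sparse system with SciPy.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve

# Toy 1-D "manifold": n points on a line, a linear target, sparse labels.
n = 100
x = np.linspace(0.0, 1.0, n)
y_true = 2.0 * x + 1.0                  # linear: zero second-order energy
labeled = np.arange(0, n, 10)           # every 10th point is labeled

# Sparse path-graph Laplacian.
main = np.full(n, 2.0)
main[0] = main[-1] = 1.0
L = diags([main, -np.ones(n - 1), -np.ones(n - 1)], [0, -1, 1], format="csc")

# Simplified analogue of a second-order penalty:
#   minimize  sum_{i labeled} (f_i - y_i)^2  +  lam * ||L f||^2
# which yields the sparse linear system (J + lam * L^T L) f = J y.
J = diags(np.isin(np.arange(n), labeled).astype(float), format="csc")
lam = 1e-2
f = spsolve((J + lam * (L.T @ L)).tocsc(), J @ y_true)
```

Because the (interior) bi-Laplacian penalty nearly vanishes on linear functions, the recovered `f` stays close to the linear target at the labeled points.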