Collaborating Authors: Massart, Pascal


Risk bounds for PU learning under Selected At Random assumption

arXiv.org Machine Learning

Positive-unlabeled (PU) learning is a special case of semi-supervised binary classification in which only a fraction of the positive examples are labeled. The challenge is then to find the correct classifier despite this lack of information. Recently, new methodologies have been introduced to address the case where the probability of being labeled may depend on the covariates. In this paper, we are interested in establishing risk bounds for PU learning under this general assumption. In addition, we quantify the impact of label noise on PU learning compared to the standard classification setting. Finally, we provide a lower bound on the minimax risk, proving that the upper bound is almost optimal.
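Under the Selected At Random (SAR) assumption, the labeling probability depends on the covariates through a propensity score e(x) = P(s = 1 | y = 1, x). To make the setting concrete, here is a minimal sketch, not the paper's estimator, that assumes e(x) is known and minimizes an unbiased reweighting of the logistic risk; the helper names sar_pu_risk and fit_sar_pu are hypothetical:

```python
import numpy as np

def sar_pu_risk(w, X, s, e):
    """Unbiased empirical logistic risk from PU data under SAR.

    Uses the identity  E[l(g(X), Y)]
      = E[(s / e(X)) l(g(X), +1) + (1 - s / e(X)) l(g(X), -1)],
    valid when e(x) = P(s = 1 | y = 1, x) and only positives are labeled.
    """
    z = X @ w
    a = s / e
    return np.mean(a * np.logaddexp(0.0, -z)             # loss on the "+1" copy
                   + (1.0 - a) * np.logaddexp(0.0, z))   # loss on the "-1" copy

def fit_sar_pu(X, s, e, lr=0.1, n_iter=2000):
    """Plain gradient descent on the unbiased risk (illustrative only)."""
    w = np.zeros(X.shape[1])
    a = s / e
    for _ in range(n_iter):
        sig = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (sig - a) / len(s)  # gradient of sar_pu_risk in w
    return w

# Toy PU data: positives are labeled with a covariate-dependent probability.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 2))
y = X[:, 0] + X[:, 1] > 0                          # hidden true labels
e = 1.0 / (1.0 + np.exp(-X[:, 0]))                 # propensity depends on x
s = ((rng.random(len(y)) < e) & y).astype(float)   # observed labels
w = fit_sar_pu(X, s, e)
print("accuracy vs. hidden labels:", np.mean((X @ w > 0) == y))
```

In practice e(x) must itself be estimated, which is part of what makes risk bounds in this setting delicate; the sketch only illustrates why a covariate-dependent labeling probability changes the effective loss being minimized.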


Kernel Projection Machine: a New Tool for Pattern Recognition

Neural Information Processing Systems

This paper investigates the effect of Kernel Principal Component Analysis (KPCA) within the classification framework, in particular the regularization properties of this dimensionality reduction method. KPCA has previously been used as a pre-processing step before applying an SVM, but we point out that this combination is somewhat redundant from a regularization point of view, and we propose a new algorithm, the Kernel Projection Machine, that avoids this redundancy, based on an analogy with the statistical framework of regression in a Gaussian white noise model. Preliminary experimental results show that this algorithm achieves the same performance as an SVM.
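To illustrate the idea, here is a minimal scikit-learn sketch, an approximation rather than the paper's algorithm: the data are projected onto the first d kernel principal components and an essentially unregularized linear classifier is fitted on top, so that the retained dimension d plays the role of the regularization parameter. The dataset, kernel bandwidth, and candidate dimensions are arbitrary choices for the demo:

```python
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Project onto the first d kernel principal components, then fit an
# (almost) unregularized linear classifier: the dimension d, not a
# penalty term, controls the complexity of the estimator.
kpm = Pipeline([
    ("kpca", KernelPCA(kernel="rbf", gamma=1.0)),
    ("clf", LogisticRegression(C=1e6, max_iter=1000)),  # large C ~ no penalty
])

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
search = GridSearchCV(kpm, {"kpca__n_components": [1, 2, 5, 10, 20, 40]}, cv=5)
search.fit(X, y)
print("selected dimension d:", search.best_params_["kpca__n_components"])
print("cross-validated accuracy:", round(search.best_score_, 3))
```

The paper's Kernel Projection Machine selects d by a penalized model-selection criterion and minimizes an empirical loss over the projected space; cross-validation and the logistic loss above are stand-ins for those ingredients.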

