dedpul
47b4f1bfdf6d298682e610ad74b37dca-Paper.pdf
Given only positive examples and unlabeled examples (from both positive and negative classes), we might hope nevertheless to estimate an accurate positive-versus-negative classifier. Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE): determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning: given such an estimate, learning the desired positive-versus-negative classifier.
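The two subtasks above can be made concrete with a toy one-dimensional example. The sketch below is an illustration only, not the method of the paper: it assumes the positive and negative class densities are known Gaussians (the names `f_pos`, `f_neg`, `f_unl` and the value `TRUE_ALPHA` are hypothetical), whereas in practice they would have to be estimated from data. Step (i) uses the fact that the unlabeled density is at least alpha times the positive density everywhere, so the smallest density ratio upper-bounds alpha. Step (ii) turns the estimate into a posterior probability of being positive via Bayes' rule.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

# Assumption: fraction of positives in the unlabeled data (unknown in practice).
TRUE_ALPHA = 0.3

def f_pos(x):
    # Assumed positive-class density.
    return normal_pdf(x, 2.0, 1.0)

def f_neg(x):
    # Assumed negative-class density (never observed directly in PU learning).
    return normal_pdf(x, -2.0, 1.0)

def f_unl(x):
    # Unlabeled data is a mixture of the two classes.
    return TRUE_ALPHA * f_pos(x) + (1.0 - TRUE_ALPHA) * f_neg(x)

def estimate_alpha(grid):
    """Step (i), MPE: f_unl(x) >= alpha * f_pos(x) for all x,
    so the minimum of f_unl / f_pos is an upper bound on alpha."""
    return min(f_unl(x) / f_pos(x) for x in grid)

def posterior_positive(x, alpha):
    """Step (ii), PU classification: Bayes' rule for P(positive | x),
    clipped to 1 in case alpha is overestimated."""
    return min(1.0, alpha * f_pos(x) / f_unl(x))

grid = [i / 10.0 for i in range(-60, 61)]
alpha_hat = estimate_alpha(grid)

print(f"estimated alpha: {alpha_hat:.4f}")
print(f"P(positive | x=2):  {posterior_positive(2.0, alpha_hat):.4f}")
print(f"P(positive | x=-2): {posterior_positive(-2.0, alpha_hat):.4f}")
```

The bound in step (i) is tight here only because the negative density becomes negligible relative to the positive one somewhere on the grid; when the classes overlap everywhere, the minimum ratio overestimates the true proportion.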
DEDPUL: Method for Mixture Proportion Estimation and Positive-Unlabeled Classification based on Density Estimation
This paper studies Positive-Unlabeled Classification, the problem of semi-supervised binary classification in which the Negative (N) class in the training set is contaminated with instances of the Positive (P) class. We develop a novel method (DEDPUL) that simultaneously solves two problems concerning the contaminated Unlabeled (U) sample: it estimates the proportions of the mixing components (P and N) in U, and it classifies U. By conducting experiments on synthetic and real-world data, we favorably compare DEDPUL with current state-of-the-art methods for both problems. We introduce an automatic procedure for DEDPUL hyperparameter optimization. Additionally, we improve two methods in the literature and achieve DEDPUL-level performance with one of them.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
- Research Report > Promising Solution (0.54)
- Research Report > Experimental Study (0.46)
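
The setting in the DEDPUL abstract above, a labeled Positive sample plus an Unlabeled sample that mixes P and N, can be simulated in a few lines. The sketch below is a hedged illustration of classifying U through density estimation and is not the DEDPUL algorithm itself, whose details the abstract does not give. It assumes one-dimensional Gaussian classes, a fixed kernel bandwidth, and a mixing proportion `ALPHA` that is treated as already known; all names and values are hypothetical.

```python
import math
import random

def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate from samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

    return density

rng = random.Random(0)
ALPHA = 0.4  # assumption: proportion of P in U, treated as known in this sketch
N = 300

# Labeled Positive sample.
positives = [rng.gauss(3.0, 1.0) for _ in range(N)]

# Unlabeled sample: a mixture of P and N. The true labels are kept only
# to score the result; a real PU learner never sees them.
unlabeled, is_positive = [], []
for _ in range(N):
    pos = rng.random() < ALPHA
    unlabeled.append(rng.gauss(3.0 if pos else -3.0, 1.0))
    is_positive.append(pos)

# Estimate both densities from the samples.
f_pos_hat = gaussian_kde(positives, 0.5)
f_unl_hat = gaussian_kde(unlabeled, 0.5)

# Posterior probability that each unlabeled point is positive (Bayes' rule),
# clipped to 1 because the estimated ratio can overshoot.
posteriors = [min(1.0, ALPHA * f_pos_hat(x) / f_unl_hat(x)) for x in unlabeled]
predicted = [p > 0.5 for p in posteriors]
accuracy = sum(p == t for p, t in zip(predicted, is_positive)) / N

print(f"accuracy on U: {accuracy:.3f}")
```

The classes here are deliberately well separated, so even a crude fixed-bandwidth estimate labels most of U correctly; with overlapping classes the result depends much more on the bandwidth and on how well the proportion is estimated.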