positive-unlabeled learning
Supplementary Material for "Partial Optimal Transport with Applications on Positive-Unlabeled Learning"
The proof involves three steps: 1. we first justify the definition of p and q in the extended problem formulation, and show that … It is straightforward to see that, by doing so, we ensure that Γ remains an admissible coupling (see Figure 1 for an illustration). Figure 1: Repartition of the mass for matrices Γ and T; each of them has a total mass of … (with a constant A > 2ξ) … the GW formulation involves pairs of points. This yields the following cases. Case 1: a > 0. In that case, φ(γ) is a convex function whose minimum on [0, 1] is reached for γ … We have φ(0) = c > 0 and φ(1) = a + b + c. The minimum is then attained at 0 if a + b > 0, and at 1 otherwise, which gives the desired result.
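The endpoint case analysis in the snippet can be checked numerically. The helper below is hypothetical (not from the paper): it minimises a quadratic φ(γ) = aγ² + bγ + c over [0, 1], using the clipped vertex when φ is convex and the endpoint comparison φ(0) vs. φ(1), i.e. the sign of a + b, otherwise.

```python
def argmin_quadratic_on_unit_interval(a, b, c):
    """Minimise phi(gamma) = a*gamma**2 + b*gamma + c over [0, 1].

    Hypothetical helper illustrating the snippet's case analysis.
    """
    if a > 0:
        # Case 1: phi is convex; the unconstrained minimiser -b/(2a)
        # is clipped to the interval [0, 1].
        return min(max(-b / (2 * a), 0.0), 1.0)
    # Otherwise the minimum lies at an endpoint: since phi(1) - phi(0)
    # equals a + b, the minimum is at 0 iff a + b >= 0.
    return 0.0 if a + b >= 0 else 1.0
```

Note that φ(1) − φ(0) = a + b, which is exactly the quantity whose sign decides between the two endpoints in the text.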
- Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Positive-Unlabeled Learning using Random Forests via Recursive Greedy Risk Minimization
The need to learn from positive and unlabeled data, or PU learning, arises in many applications and has attracted increasing interest. While random forests are known to perform well on many tasks with positive and negative data, recent PU algorithms are generally based on deep neural networks, and the potential of tree-based PU learning is under-explored. In this paper, we propose new random forest algorithms for PU learning. Key to our approach is a new interpretation of decision tree algorithms for positive and negative data as \emph{recursive greedy risk minimization algorithms}. We extend this perspective to the PU setting to develop new decision tree learning algorithms that directly minimize PU-data-based estimators of the expected risk. This allows us to develop an efficient PU random forest algorithm, PU extra trees. Our approach features three desirable properties: it is robust to the choice of loss function, in the sense that various loss functions lead to the same decision trees; it requires little hyperparameter tuning compared to neural-network-based PU learning; and it supports a feature-importance measure that directly quantifies a feature's contribution to risk minimization. Our algorithms demonstrate strong performance on several datasets.
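The abstract does not spell out the tree-growing algorithms; as a minimal sketch of one ingredient of the idea, estimating a 0-1 risk from PU data alone, the hypothetical helper below labels a leaf from P/U counts, assuming the class prior π is known.

```python
def pu_leaf_label(n_p_leaf, n_p, n_u_leaf, n_u, pi):
    """Choose a leaf label from PU counts and a known class prior pi.

    Hypothetical sketch, not the paper's algorithm. The estimated 0-1
    risk of labelling the leaf +1 is the negative mass in the leaf,
        P(x in leaf) - pi * P(x in leaf | y = +1),
    and of labelling it -1 is the positive mass,
        pi * P(x in leaf | y = +1),
    where P(x in leaf) is estimated from U data and
    P(x in leaf | y = +1) from P data.
    """
    pos_mass = pi * n_p_leaf / n_p                  # estimated from P data
    neg_mass = max(n_u_leaf / n_u - pos_mass, 0.0)  # clipped PU estimate
    return +1 if pos_mass >= neg_mass else -1
```

For example, a leaf holding 80 of 100 positives but only 50 of 100 unlabeled points (with π = 0.5) is labelled +1, since its estimated positive mass 0.4 exceeds its estimated negative mass 0.1.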
Positive-Unlabeled Learning with Non-Negative Risk Estimator
From only positive (P) and unlabeled (U) data, a binary classifier can be trained with PU learning, in which the state of the art is unbiased PU learning. However, if the model is very flexible, the empirical risk on training data will go negative, and serious overfitting follows. In this paper, we propose a non-negative risk estimator for PU learning: when minimized, it is more robust against overfitting, which allows us to use very flexible models (such as deep neural networks) given limited P data. Moreover, we analyze the bias, consistency, and mean-squared-error reduction of the proposed risk estimator, and bound the estimation error of the resulting empirical risk minimizer. Experiments demonstrate that our risk estimator fixes the overfitting problem of its unbiased counterpart.
Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning
In PU learning, a binary classifier is trained from positive (P) and unlabeled (U) data without negative (N) data. Although N data are missing, PU learning sometimes outperforms PN learning (i.e., ordinary supervised learning). Hitherto, neither theoretical nor experimental analysis has been given to explain this phenomenon. In this paper, we theoretically compare PU (and NU) learning against PN learning based on upper bounds on the estimation errors. We find simple conditions under which PU and NU learning are likely to outperform PN learning, and we prove that, in terms of these upper bounds, either PU or NU learning (depending on the class-prior probability and the sizes of P and N data) given infinite U data will improve on PN learning. Our theoretical findings agree well with the experimental results on artificial and benchmark data, even when the experimental setup does not match the theoretical assumptions exactly.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
An Effective Flow-based Method for Positive-Unlabeled Learning: 2-HNC
Hochbaum, Dorit, Nitayanont, Torpong
In many scenarios of binary classification, only positive instances are provided in the training data, leaving the rest of the data unlabeled. This setup, known as positive-unlabeled (PU) learning, is addressed here with a network-flow-based method that utilizes pairwise similarities between samples. The method we propose, 2-HNC, leverages Hochbaum's Normalized Cut (HNC) and the set of solutions it provides by solving a parametric minimum cut problem. These solutions, which are nested partitions of the samples into two sets, correspond to varying tradeoffs between two goals: high intra-similarity inside the sets and low inter-similarity between them. This nested sequence is used here to deliver a ranking of unlabeled samples by their likelihood of being negative. Building on this insight, 2-HNC proceeds in two stages. The first stage generates the ranking without assuming any negative labels, using a problem formulation that is constrained only on positively labeled samples. The second stage augments the positive set with likely-negative samples and recomputes the classification. The final label prediction selects, among all partitions generated in both stages, the one whose positive-class proportion is closest to a prior estimate of this quantity, which is assumed to be given. Extensive experiments across synthetic and real datasets show that 2-HNC yields strong performance and often surpasses existing state-of-the-art algorithms.
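As a rough illustration of the parametric-cut idea (a toy sketch under our own assumptions, not the 2-HNC algorithm), the helper below replaces the single parametric minimum-cut computation with a naive sweep of ordinary minimum cuts over candidate values of λ: each unlabeled node is ranked by the first λ at which it is severed from the positive seeds, and nodes severed earlier are ranked as more likely negative.

```python
import itertools
import networkx as nx

def rank_unlabeled_by_cut(sim, positives, lambdas):
    """Rank unlabeled nodes from most to least likely negative.

    sim: dict {(i, j): similarity}; positives: iterable of seed node ids;
    lambdas: candidate tradeoff values to sweep (a naive stand-in for a
    parametric minimum-cut computation). Node ids "s"/"t" are assumed free.
    """
    nodes = set(itertools.chain.from_iterable(sim))
    unlabeled = [v for v in nodes if v not in set(positives)]
    drop_out = {}
    for lam in sorted(lambdas):
        G = nx.Graph()
        for (i, j), w in sim.items():
            G.add_edge(i, j, capacity=w)
        for p in positives:
            G.add_edge("s", p, capacity=float("inf"))  # seeds stay with source
        for v in nodes:
            G.add_edge(v, "t", capacity=lam)  # staying with the source costs lam
        _, (source_side, _) = nx.minimum_cut(G, "s", "t")
        for v in unlabeled:
            if v not in source_side and v not in drop_out:
                drop_out[v] = lam  # first lambda at which v is severed
    order = sorted(drop_out, key=drop_out.get)
    # Nodes never severed within the sweep are ranked last (most positive-like).
    return order + [v for v in unlabeled if v not in drop_out]
```

On a tiny graph where node "a" is strongly tied to the positive seed and "b" weakly, "b" is severed at a smaller λ and therefore ranked first (most likely negative).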
- North America > United States > California > Alameda County > Berkeley (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Czechia > Prague (0.04)
Supplementary Material for "Partial Optimal Transport with Applications on Positive-Unlabeled Learning"
The proof involves 3 steps: 1. … (with a constant A > 2ξ) … the GW formulation involves pairs of points. This yields the following cases. Case 1: a > 0. In that case, φ(γ) is a convex function whose minimum on [0, 1] is reached for γ … Using the development in Section 1.2.2 of the supplemental, we can establish that … The partial-OT computation is based on an augmented problem with a dummy point and, as such, is convex. In contrast, the GW problem is non-convex and, although the algorithm provably converges, there is no guarantee that the global optimum is reached. The quality of the solution is therefore highly dependent on the initialization.
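The dummy-point construction mentioned in the snippet can be sketched in miniature for uniform, equal masses, where exact OT reduces to an assignment problem. The helper below is our own toy version (scipy's assignment solver stands in for an OT solver): the cost matrix is augmented with dummy rows and columns at cost ξ, and the dummy-dummy block gets a prohibitive cost, loosely mirroring the constant A > 2ξ from the text.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def partial_match(C, k, xi=0.0):
    """Keep only the k cheapest-compatible pairs of an n-to-n matching.

    Toy sketch of the dummy-point trick for uniform, equal masses:
    C is an (n, n) cost matrix, and 2(n - k) points are absorbed by
    dummy rows/columns of cost xi. The dummy-dummy block is made
    prohibitively expensive so the matching stays genuinely partial.
    """
    n = C.shape[0]
    d = n - k
    A = 2 * xi + n * float(C.max()) + 1.0  # large enough to forbid dummy-dummy
    C_aug = np.block([
        [C,                   np.full((n, d), xi)],
        [np.full((d, n), xi), np.full((d, d), A)],
    ])
    rows, cols = linear_sum_assignment(C_aug)
    # Keep only real-to-real matches (indices below n on both sides).
    return [(i, j) for i, j in zip(rows, cols) if i < n and j < n]
```

With k = n this recovers the full matching; with k < n the most expensive pairs are routed through the dummies and dropped.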
- Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
- North America > Canada (0.04)
Positive-Unlabeled Learning for Control Group Construction in Observational Causal Inference
Tsoumas, Ilias, Bormpoudakis, Dimitrios, Sitokonstantinou, Vasileios, Askitopoulos, Athanasios, Kalogeras, Andreas, Kontoes, Charalampos, Athanasiadis, Ioannis
In causal inference, whether through randomized controlled trials or observational studies, access to both treated and control units is essential for estimating the effect of a treatment on an outcome of interest. When treatment assignment is random, the average treatment effect (ATE) can be estimated directly by comparing outcomes between groups. In non-randomized settings, various techniques are employed to adjust for confounding and approximate the counterfactual scenario in order to recover an unbiased ATE. A common challenge, especially in observational studies, is the absence of units clearly labeled as controls, that is, units known not to have received the treatment. To address this, we propose positive-unlabeled (PU) learning as a framework for identifying, with high confidence, control units from a pool of unlabeled ones, using only the available treated (positive) units. We evaluate this approach using both simulated and real-world data. We construct a causal graph with diverse relationships and use it to generate synthetic data under various scenarios, assessing how reliably the method recovers control groups that allow estimation of the true ATE. We also apply our approach to real-world data on optimal sowing and fertilizer treatments in sustainable agriculture. Our findings show that PU learning can successfully identify control (negative) units from unlabeled data based only on treated units and, through the resulting control group, estimate an ATE that closely approximates the true value. This work has important implications for observational causal inference, especially in fields where randomized experiments are difficult or costly. In domains such as earth, environmental, and agricultural sciences, it enables a plethora of quasi-experiments by leveraging available earth observation and climate data, particularly when treated units are available but control units are lacking.
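A minimal sketch of the idea on toy data (all names, data, and thresholds are ours, not the paper's pipeline): train a classifier on treated vs. unlabeled-treated-as-negative, in the spirit of Elkan and Noto's non-traditional classifier, then take the lowest-scoring unlabeled units as the high-confidence control group.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: treated units cluster around +2, true controls around -2,
# but only the treated (positive) units carry a label.
treated = rng.normal(+2.0, 1.0, size=(100, 2))
unlabeled = np.vstack([
    rng.normal(+2.0, 1.0, size=(50, 2)),   # hidden treated among unlabeled
    rng.normal(-2.0, 1.0, size=(150, 2)),  # hidden controls
])

# Non-traditional classifier: positives vs. unlabeled-as-negative.
X = np.vstack([treated, unlabeled])
s = np.r_[np.ones(len(treated)), np.zeros(len(unlabeled))]
clf = LogisticRegression().fit(X, s)

# Unlabeled units with the lowest treatment scores form the control group.
scores = clf.predict_proba(unlabeled)[:, 1]
controls = unlabeled[scores < np.quantile(scores, 0.5)]
```

The cutoff (here, the median score) is an arbitrary illustration; in practice a calibrated threshold or a prior on the treated proportion would drive the selection.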
- North America > Canada > Ontario > Toronto (0.06)
- Europe > Greece > Attica > Athens (0.05)
- North America > United States > Texas > Crockett County (0.04)
- (7 more...)
- Food & Agriculture > Agriculture (1.00)
- Materials > Chemicals > Agricultural Chemicals (0.35)
PU-Lie: Lightweight Deception Detection in Imbalanced Diplomatic Dialogues via Positive-Unlabeled Learning
Kuwar, Bhavinkumar Vinodbhai, Maurya, Bikrant Bikram Pratap, Gupta, Priyanshu, Choudhury, Nitin
Detecting deception in strategic dialogues is a complex and high-stakes task due to the subtlety of language and the extreme class imbalance between deceptive and truthful communications. In this work, we revisit deception detection in the Diplomacy dataset, where less than 5% of messages are labeled deceptive. We introduce a lightweight yet effective model combining frozen BERT embeddings, interpretable linguistic and game-specific features, and a Positive-Unlabeled (PU) learning objective. Unlike traditional binary classifiers, PU-Lie is tailored to situations where only a small portion of deceptive messages is labeled and the majority are unlabeled. Our model achieves a new best macro F1 of 0.60 while reducing trainable parameters by over 650x. Through comprehensive evaluations and ablation studies across seven models, we demonstrate the value of PU learning, linguistic interpretability, and speaker-aware representations. Notably, we emphasize that in this problem setting, accurately detecting deception is more critical than identifying truthful messages. This priority guides our choice of PU learning, which explicitly models the rare but vital deceptive class.