Coupled Training with Privileged Information and Unlabeled Data

Shi, Jiahao, Hagrass, Omar, Klusowski, Jason M.

May-25-2026–arXiv.org Machine Learning

In many prediction problems, we have extra information during training (for example, measurements that are expensive or slow to collect) that will not be available when the model is deployed. A common strategy is to first train a model that uses all training information, then use its predictions on unlabeled examples to train a second model that only uses the inputs available at test time. However, when the extra training-only information is weak or noisy, this Two-Stage approach can mislead the deployment model and even hurt accuracy. We propose a joint training method that learns the two models together, so the deployment model can benefit from the extra information only when it actually helps, instead of inheriting its mistakes. We provide guarantees that describe when joint training improves prediction accuracy and analyze a simple alternating training algorithm for large, high-dimensional models. Experiments on synthetic data and real-world prediction tasks show that our approach avoids these failures and robustly outperforms standard Two-Stage baselines.

artificial intelligence, machine learning, privileged information, (13 more...)

arXiv.org Machine Learning

May-25-2026

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.28)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.93)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Unsupervised or Indirectly Supervised Learning (0.52)
  - Inductive Learning (0.46)
  - Statistical Learning (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found