SADA: Safe and Adaptive Inference with Multiple Black-Box Predictions
Shan, Jiawei; Dong, Yiming; Zhao, Jiwei
Real-world applications often face scarce labeled data due to the high cost and time requirements of gold-standard experiments, whereas unlabeled data are typically abundant. With the growing adoption of machine learning techniques, it has become increasingly feasible to generate multiple predicted labels using a variety of models and algorithms, including deep learning, large language models, and generative AI. In this paper, we propose a novel approach that safely and adaptively aggregates multiple black-box predictions with unknown quality while preserving valid statistical inference. Our method provides two key guarantees: (i) it never performs worse than using the labeled data alone, regardless of the quality of the predictions; and (ii) if any one of the predictions (without knowing which one) perfectly fits the ground truth, the algorithm adaptively exploits this to achieve either a faster convergence rate or the semiparametric efficiency bound. We demonstrate the effectiveness of the proposed algorithm through experiments on both synthetic and benchmark datasets.
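Guarantee (i) can be illustrated in the simplest setting, estimating a mean, with a classical control-variate correction. The sketch below is not the authors' SADA algorithm; it is a standard stand-in, and the function name `control_variate_mean`, its arguments, and the small `ridge` stabilizer are assumptions introduced here for illustration. The labeled-only mean is adjusted by a weighted difference between the average predictions on the labeled and unlabeled samples, with weights estimated from the data so that uninformative predictions receive weights near zero.

```python
import numpy as np


def control_variate_mean(y_lab, f_lab, f_unlab, ridge=1e-8):
    """Estimate E[Y] from labeled outcomes plus K black-box predictions.

    Illustrative sketch only (classical control variates), not the paper's method.
    y_lab   : (n,)   labeled outcomes
    f_lab   : (n, K) predictions on the labeled sample
    f_unlab : (N, K) predictions on the unlabeled sample
    ridge   : hypothetical stabilizer added to the prediction covariance
    """
    y_lab = np.asarray(y_lab, dtype=float)
    f_lab = np.asarray(f_lab, dtype=float)
    f_unlab = np.asarray(f_unlab, dtype=float)
    n, N = len(y_lab), len(f_unlab)
    k = f_lab.shape[1]

    # Covariance of the predictions, estimated from all available rows.
    f_all = np.vstack([f_lab, f_unlab])
    sigma_f = np.atleast_2d(np.cov(f_all, rowvar=False)) + ridge * np.eye(k)

    # Covariance between each prediction and the outcome (labeled rows only).
    f_centered = f_lab - f_lab.mean(axis=0)
    cov_fy = f_centered.T @ (y_lab - y_lab.mean()) / (n - 1)

    # Variance-minimizing weights; w = 0 recovers the labeled-only mean.
    w = (N / (n + N)) * np.linalg.solve(sigma_f, cov_fy)

    theta = y_lab.mean() - w @ (f_lab.mean(axis=0) - f_unlab.mean(axis=0))
    return theta, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y_all = rng.normal(size=2050)
    # One prediction matches the truth, the other is pure noise.
    f_all = np.column_stack([y_all, rng.normal(size=2050)])
    theta, w = control_variate_mean(y_all[:50], f_all[:50], f_all[50:])
    print("labeled-only mean:", y_all[:50].mean())
    print("adjusted estimate:", theta, "weights:", w)
```

With the weights chosen this way, the large-sample variance of the adjusted estimate is no larger than that of the labeled-only mean, whatever the quality of the predictions. When one prediction tracks the outcome closely, the estimate approaches the mean over the pooled labeled and unlabeled samples.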
Sep-29-2025
- Country:
  - Asia > Middle East
    - Israel > Tel Aviv District
      - Tel Aviv (0.04)
    - Jordan (0.04)
  - Europe
    - Bulgaria > Sofia City Province
      - Sofia (0.04)
    - United Kingdom > England
      - Greater London > London (0.04)
  - North America > United States
    - Florida > Palm Beach County
      - Boca Raton (0.04)
    - New Jersey > Hudson County
      - Hoboken (0.04)
    - New York > New York County
      - New York City (0.14)
    - Tennessee > Davidson County
      - Nashville (0.04)
    - Wisconsin > Dane County
      - Madison (0.04)
  - Oceania > New Zealand (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.68)
    - New Finding (0.46)