A Note on the Prediction-Powered Bootstrap

Jun-7-2024–arXiv.org Machine Learning

Black-box predictive models are increasingly used to generate efficient substitutes for gold-standard labels when the latter are difficult to come by. For example, predictions of protein structures are used as efficient substitutes for slow and expensive experimental measurements [3, 4, 8], and large language models are used to cheaply generate substitutes for scarce human annotations [5, 7, 14]. Prediction-powered inference (PPI) [1] is a recent framework for statistical inference that combines a large number of machine-learning predictions with a small amount of real data to reach conclusions that are both valid and statistically powerful. While PPI [1] (and its improvement PPI++ [2]) offers a principled way to incorporate black-box predictions into the scientific workflow, its scope of application is still limited. Current analyses focus on certain convex M-estimators, such as means, quantiles, and GLMs, to ensure tractable implementation.
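As a rough illustration of how PPI combines the two data sources, the sketch below computes a prediction-powered estimate of a mean with a normal-approximation confidence interval, following the basic form of the mean estimator in [1]: the average prediction on the unlabeled data, corrected by the average prediction error measured on the labeled data. The function name `ppi_mean_ci` and its argument names are hypothetical, chosen for this example. It assumes the labeled and unlabeled samples are drawn i.i.d. from the same distribution, and it is not the bootstrap procedure named in the title.

```python
import numpy as np
from statistics import NormalDist


def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.05):
    """Prediction-powered point estimate and CI for a mean (illustrative sketch).

    y_labeled      : gold-standard labels on the small labeled set (size n)
    yhat_labeled   : model predictions on that same labeled set (size n)
    yhat_unlabeled : model predictions on the large unlabeled set (size N)
    """
    y_labeled = np.asarray(y_labeled, dtype=float)
    yhat_labeled = np.asarray(yhat_labeled, dtype=float)
    yhat_unlabeled = np.asarray(yhat_unlabeled, dtype=float)

    n = y_labeled.size
    N = yhat_unlabeled.size

    # Rectifier: how far the predictions are from the truth on labeled data.
    rectifier = y_labeled - yhat_labeled

    # Point estimate: mean prediction on unlabeled data plus the mean correction.
    theta = yhat_unlabeled.mean() + rectifier.mean()

    # The two terms come from independent samples, so their variances add.
    se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)

    # Two-sided normal-approximation interval at level 1 - alpha.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta, (theta - z * se, theta + z * se)
```

When the predictions are accurate, the rectifier has small variance and the interval width is driven mainly by the large unlabeled sample, which is where the gain in statistical power comes from.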



