Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation
