Prediction-Powered Inference Across Many Tasks for AI Evaluation & Social Science Research