Reliable and Efficient Amortized Model-based Evaluation