Efficient Evaluation of Large Language Models via Collaborative Filtering