Anchor Points: Benchmarking Models with Much Fewer Examples