Composing Efficient, Robust Tests for Policy Selection