On the Partial Identifiability in Reward Learning: Choosing the Best Reward