Nonuniqueness and Convergence to Equivalent Solutions in Observer-based Inverse Reinforcement Learning