On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems