Nonparametric Bayesian Learning of Other Agents' Policies in Interactive POMDPs