On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman