Equilibrium Bandits: Learning Optimal Equilibria of Unknown Dynamics