Simple and optimal methods for stochastic variational inequalities, II: Markovian noise and policy evaluation in reinforcement learning