A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport