Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup