Scalable Centralized Deep Multi-Agent Reinforcement Learning via Policy Gradients