Reducing Overestimation Bias in Multi-Agent Domains Using Double Centralized Critics