Independent policy gradient-based reinforcement learning for economic and reliable energy management of multi-microgrid systems