Multi-Agent Reinforcement Learning for Heterogeneous Satellite Cluster Resources Optimization