Efficient Use of Heuristics for Accelerating XCS-based Policy Learning in Markov Games