Reinforcement Learning Based Self-play and State Stacking Techniques for Noisy Air Combat Environment