On The Presence of Double-Descent in Deep Reinforcement Learning