A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning