CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias