Extending a Quantum Reinforcement Learning Exploration Policy with Flags to Connect Four