COVID-19 Pandemic Cyclic Lockdown Optimization Using Reinforcement Learning