A Non-Monolithic Policy Approach of Offline-to-Online Reinforcement Learning