Efficient Online RL Fine Tuning with Offline Pre-trained Policy Only