Model-Free Robust $\phi$-Divergence Reinforcement Learning Using Both Offline and Online Data