Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees