Policy Gradient Guidance Enables Test Time Control