Preventing Reward Hacking with Occupancy Measure Regularization