Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans