MORPH: Design Co-optimization with Reinforcement Learning via a Differentiable Hardware Model Proxy